This can come in handy if you're working with large Python objects and want to be certain that either your unit tests check everything you retrieve, or all the data you draw from a database is actually used in a report.
For example, you might have a `SELECT fieldX, fieldY, fieldZ FROM ...` SQL query, but only use `fieldX` and `fieldY` in the report's CSV export, so `fieldZ` is fetched for nothing.
class TrackingDict(dict):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._accessed_keys = set()

    def __getitem__(self, key):
        # Record the key before the lookup, so a missing key is recorded too.
        self._accessed_keys.add(key)
        return super().__getitem__(key)

    @property
    def accessed_keys(self):
        return self._accessed_keys

    @property
    def never_accessed_keys(self):
        # Keys present in the dict that were never read via dict[key].
        return set(self.keys()) - self._accessed_keys
Example use case:
user = {
    "name": "John Doe",
    "age": 30,
    "email": "jd@example.com",
}
user = TrackingDict(user)
assert user["name"] == "John Doe"
print("Accessed keys:", user.accessed_keys)
print("Never accessed keys:", user.never_accessed_keys)
This will print (the order within each set may vary):
Accessed keys: {'name'}
Never accessed keys: {'email', 'age'}
This can be useful if, say, you have a pytest test that checks the values of a dict object and you want to be sure that it checks every one of them. For example:
assert not user.never_accessed_keys, f"You never checked {user.never_accessed_keys}"
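To tie this back to the report scenario from the introduction, here is a minimal sketch using an in-memory SQLite database. The table name `things`, the sample rows, and the in-memory CSV buffer are all made up for illustration. The class is repeated (without `accessed_keys`) only so the snippet runs on its own.

```python
import csv
import io
import sqlite3


class TrackingDict(dict):
    """Same idea as the class above, repeated so this sketch is self-contained."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._accessed_keys = set()

    def __getitem__(self, key):
        self._accessed_keys.add(key)
        return super().__getitem__(key)

    @property
    def never_accessed_keys(self):
        return set(self.keys()) - self._accessed_keys


# Hypothetical table and data, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE things (fieldX TEXT, fieldY TEXT, fieldZ TEXT)")
conn.execute("INSERT INTO things VALUES ('x1', 'y1', 'z1')")
conn.execute("INSERT INTO things VALUES ('x2', 'y2', 'z2')")

cursor = conn.execute("SELECT fieldX, fieldY, fieldZ FROM things")
columns = [col[0] for col in cursor.description]
# Wrap every row so that reads are tracked per row.
rows = [TrackingDict(zip(columns, values)) for values in cursor]

# The "report": a CSV export that only reads two of the three columns.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["fieldX", "fieldY"])
for row in rows:
    writer.writerow([row["fieldX"], row["fieldY"]])

# Collect every column that was selected but never read by the report.
unused = set()
for row in rows:
    unused |= row.never_accessed_keys

print("Selected but never used:", unused)  # Selected but never used: {'fieldZ'}
```

Here the check reports `fieldZ`, which suggests the query is fetching a column the export never reads.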
UPDATE (June 16)
Here's a typed version, by commenter Michael Cook:
from typing import TypeVar, Any

K = TypeVar('K')
V = TypeVar('V')


class TrackingDict(dict[K, V]):
    """
    A `dict` that keeps track of which keys are accessed.
    """

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        super().__init__(*args, **kwargs)
        self._accessed_keys: set[K] = set()

    def __getitem__(self, key: K) -> V:
        self._accessed_keys.add(key)
        return super().__getitem__(key)

    @property
    def accessed_keys(self) -> set[K]:
        return self._accessed_keys

    @property
    def never_accessed_keys(self) -> set[K]:
        return set(self.keys()) - self._accessed_keys
Comments
What would it look like with type hints?
What do you think it could look like?
If you can make a version that you think will work, we can update the blog post.
How about this?
from typing import TypeVar, Any

K = TypeVar('K')
V = TypeVar('V')


class TrackingDict(dict[K, V]):
    """
    A `dict` that keeps track of which keys are accessed.
    """

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        super().__init__(*args, **kwargs)
        self._accessed_keys: set[K] = set()

    def __getitem__(self, key: K) -> V:
        self._accessed_keys.add(key)
        return super().__getitem__(key)

    @property
    def accessed_keys(self) -> set[K]:
        return self._accessed_keys

    @property
    def never_accessed_keys(self) -> set[K]:
        return set(self.keys()) - self._accessed_keys
Blog post updated, thanks to you! Thanks!
Cursor effects are annoying. I would discourage their use.
Interesting concept! I'm experimenting with this and have noticed a few potential issues. Currently TrackingDict doesn't respond to the .get() method; an easy fix there might be inheriting from UserDict or refactoring to implement MutableMapping. Another thing I noticed is that key lookup misses also get added to the _accessed_keys set, e.g. a lookup of user["address"] wrapped in a try/except KeyError. This might be the desired behavior though?
Those are great points! The original place where I wanted and needed this only used `__getitem__`, so I didn't think of `.get()`.
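Both points from the comment above could be handled roughly like this. It is a sketch of one possible approach, not the version used in the post. It assumes you are fine with inheriting from `collections.UserDict` (so the object is no longer a real `dict` instance), and that lookup misses should not count as accessed.

```python
from collections import UserDict


class TrackingDict(UserDict):
    """A UserDict-based variant: .get() is tracked, misses are not."""

    def __init__(self, *args, **kwargs):
        # Create the set first; UserDict.__init__ then fills self.data.
        self._accessed_keys = set()
        super().__init__(*args, **kwargs)

    def __getitem__(self, key):
        value = super().__getitem__(key)  # raises KeyError on a miss
        self._accessed_keys.add(key)      # only reached when the key exists
        return value

    @property
    def accessed_keys(self):
        return self._accessed_keys

    @property
    def never_accessed_keys(self):
        # Iterating the keys does not go through __getitem__.
        return set(self.keys()) - self._accessed_keys


user = TrackingDict({"name": "John Doe", "age": 30, "email": "jd@example.com"})

user.get("name")     # UserDict's .get() goes through __getitem__ for existing keys
user.get("address")  # a miss: returns None and is not recorded

try:
    user["address"]  # also a miss: raises KeyError and is not recorded
except KeyError:
    pass

print("Accessed keys:", user.accessed_keys)  # Accessed keys: {'name'}
print("Never accessed keys:", sorted(user.never_accessed_keys))  # Never accessed keys: ['age', 'email']
```

One thing to watch for with this variant: on a `UserDict`, iterating `.items()` or `.values()` also goes through `__getitem__`, so a loop over either marks every key as accessed.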