Knowledge database





Logic issues

L9 Bad design 9% of all issues

Seems like things could be organized in a better way.

L50 Reinventing a wheel 5% of all issues

Whatever you want to code, usually someone has already done that. And usually there are some tools to cover your case. And usually you can find them using Google.

L22 Function with side effects 2% of all issues

Side effects are evil: it's very hard to track what function does under the hood, because function signature does not reflect side effects. Always try to write "pure" functions, i.e. those which receive inputs, do something internally, and return output. You should be able to run such function million times and get exactly the same result with nothing changed in the environment. However, side effects are totally OK if that's the only thing the function does.

L39 Using generic exception 2% of all issues

Exceptions should be easy to catch. If your code throws only Exception or ValueError, then it's very hard to catch specific errors, because all thrown exception classes are the same. Create application-specific exceptions, so that every logical error has its own exception class: class VerySpecificException(Exception): pass

L20 Hardcoded value 1% of all issues

If some value is hidden somewhere in the middle of code, it will be very hard to find it later. Better solutions: 1) use environment variables 2) use constants in module's top section 3) use class variables.

L29 Should be class variable 1% of all issues

Key constants of a class should be so-called "class variables" instead of using/defining them inside methods. If you create a child class and want to change those variables, you'll have to rewrite entire methods. With class vars, you only have to overwrite those class vars in child. Example: class MyClass: SOME_VALUE = 1 and class MySubclass(Class): SOME_VALUE = 2

L56 Catching generic exception 1% of all issues

Catching Exception or everything means that you don't understand what the code does. So you may catch and suppress exceptions which you probably didn't want to catch at all but you weren't aware of them. It is better for your program to fail, rather then catching everything.

L42 Too big "try" block 1% of all issues

You should include as few lines of code inside try block as possible. First, you may accidentally catch exceptions from lines which you didn't want to be caught. Second, the smaller try ... except blocks are, the less cognitive load it makes.

L26 Executing code in global scope 1% of all issues

When you have some code in module scope, the code will be executed when someone imports that module - which makes the module completely non-reusable. Maybe it's not meant to be reusable today, but you'll have to refactor if you need to import the module later. Use if __name__ == '__main__' to detect whether module was imported or used as a standalone script.

L17 Using global variable 1% of all issues

Global variables have one drawback: they are global. Which means that any snippet of code which relies on such global variables also becomes global, i.e. you can't run two copies of that code simultaneously. Also, global variables prevent you from writing clean tests where you can change input parameters and verify that your program behaves correctly - because some input parameter is a global variable.

L38 Not using isinstance() 1% of all issues

Idiomatic python type check is if isinstance(type, variable) or if isinstance((type1, type2), variable). Using type(variable) == SomeClass smells because it won't work if variable is a subclass of SomeClass.

L31 Eval 1% of all issues

You. Should. Never. Use. Eval. Executing code from a variable is very dangerous. Almost always there is a better option than using eval. Usually all you need is either getattr() or google.

L33 Request without timeout 1% of all issues

Surprise! Python's requests module does not have a timeout by default! So if you don't use the timeout parameter when making a request and if remote resource hangs forever (or is very slow), your code will hang, too. Always use reasonable timeouts. A good value may be 5 seconds.

L43 Could be one-liner 1% of all issues

One-liners' benefit is that they do one action in one line. This means that 1) your code will take less lines of code (which improves readability) 2) there will be only one assignment to a variable (a = x if b else y] instead of if b: a = x else: a = y), which is less error-prone 3) if using list/dict/etc comprehensions, you simply write a faster code because they are fast

L52 Not chaining exceptions 1% of all issues

When catching an exception and then raising another one, you can chain them to get better tracebacks. Use raise MyException from OriginalException.

L67 Mutable defaults 1% of all issues

When function's argument has a mutable default, it may lead to bugs because defaults are shared across all invocations of function. So def fn(a: list = []) is a bad practice and should be replaced, for example, with def fn(a: list | None = None): a = a or []

L10 Modify in-place and return 1% of all issues

There are functions that receive input and return output. There are functions that receive some object and modify it in-place. For example, reversed() vs list.reverse(), dataframe.apply() vs dataframe.apply(inplace=True). Doing both at the same time is confusing: when function returns something, it's not expected that it has any side effect, but here it does.

L25 Hardcoding interpreter 1% of all issues

It is usually not a good practice to hardcode specific python interpreter at specific path: #!/usr/bin/python. Instead, use #!/usr/bin/env python, which will interpret $PATH and make your script more portable.

L47 Not all cases covered 1% of all issues

Python doesn't check whether you inspected all possible values of some collection. If a variable can become "one" or "two", and you check if var == "one": ... elif var == "two": ... then this code will break when one day you allow variable to be "three". You should always check whether the variable holds something unexpected, and in such case an exception should be raised: else: raise ValueError(f'Unexpected value: {var}').

L49 Not using super() 1% of all issues

Calling super() is a well-readable way to show that you want to call super method. If you want to call direct parent's method, just use super().method(args). Isn't it beautiful?

L54 Not using `startswith()` 1% of all issues

Very often developers use beginning in value instead of value.startswith(beginning). For example, checking 'google.com' in url may become true if url is https://evil.com?q=google.com. This may lead to vulnerabilities, sometimes severe. If you want to check that some string starts with some value, use some_string.startswith(some_value).

L57 Not exiting with error 1% of all issues

If your program is a CLI and it fails, then consider exiting with non-zero code, for example exit(1).

L60 Not closing file 1% of all issues

Once you open a file, you should close it afterwards. In cpython not closing a file is relatively safe, in pypy it's not, and in general one should not rely on python implementation details.

L64 Sleeping with hope 1% of all issues

Using sleep() and hoping that something will happen within sleep duration (for example, web page loading) is a very bad practice, because nothing guarantees you that. The assumption that some event happens within some time is very weak. Instead of using some duration number, you should strictly rely on mechanisms that report when the page is loaded. As a simple example, use while True with delays and timeout, and at each iteration check that the page was loaded.

L68 Typo 1% of all issues

People make typos. Here it is.

L69 Bad return type 1% of all issues

In python, when function fails to do something it's a common practice to either raise an exception or return None. Returning empty strings, False or anything else is confusing. Additionally it's nice that 1) python's return is the same as return None and 2) when execution flow reaches end of function and sees no "return" statement, it invokes return None automatically.

L11 Case not covered 1% of all issues

Some condition not taken into account

L24 Not using /usr/bin/env 1% of all issues

It is a good practice to not force using specific python from specific path #!/usr/bin/python, and instead make #!/usr/bin/env python which will interpret $PATH and make your scripts more portable.

L45 Not using zip() 1% of all issues

zip() function is a nice way to process pairs, triples etc of elements from different collection. If collections are of different sizes, you may need zip_longest.





Optimization issues

O15 Using print() 4% of all issues

print() is a nice way to output to stdout. But one day you'll need to not only write to stdout, but also, say, to a file. Another day you'll need to output only severe errors' messages, and nothing else. This all could be solved if using logging module. Usually it's as easy as from logging import getLogger; log = getLogger(__name__).

O23 Calculated multiple times 2% of all issues

Calculating something multiple times is redundant - use variables to store result of some calculation.

O5 Not using list comprehension 1% of all issues

List comprehensions are faster than .append() or .extend(). When you use a loop for filling in a list, it's a high chance that list comprehension will do a better job. You may also use asterisk (*) to generate list from single and multiple elements: all_items = [*items_group1, *items_group2, single_element]

O16 While True 1% of all issues

Developers always write code without bugs. So there's no chance of writing a code that will never exit infinite loop, right? However, if it happens anyway and your program ends up in an infinite loop, it won't just break and raise an exception. It will get stuck, become unresponsive, and maybe will eat all memory if you allocate something inside the loop. Very dangerous, but easy to avoid - use for loops with iteration limit: for _ in range(10_000): <your code> else: raise LoopError('Ran code 10k times and didn't break from loop, something went terribly wrong')

O3 Not using "inplace" in Pandas 1% of all issues

Many pandas dataframe methods have inplace param which modifies the dataframe itself and does not create a new one. This is helpful if you want to avoid redundant creation of new dataframe and removing an old one, which happens when assigning to original variable: df = df.apply(...) -> df.apply(..., inplace=True).

O8 Missing @wraps 1% of all issues

When writing decorators, it is a good practice to decorate them with @wraps(decorated_function) which does some nice things you usually don't think about when decorating: preserving name and docstring of the decorated function.

O30 Not using sets 1% of all issues

Searching in list is slow - O(n). Searching in set is fast - O(1). Also, lists are mutable while sets are not, so if you want to indicate and ensure that some collection never changes - use sets. So [1, 2, 3] -> {1, 2, 3}.

O32 Not using requests.Session 1% of all issues

Each call of requests.get() (and other methods) creates a new connection to web resource. It is okay if you make a single request, but is very non-optimal when making subsequent requests. You should always use sessions which will hold connection between requests: session = requests.Session(); response = session.get(url)

O51 Not using `maxsplit` 1% of all issues

Sometimes you need to split string and take only first or last element of the split. By default, str.split() does as much splits as possible, but you can limit it with maxsplit parameter. Say, you have a string address = 'domain.com/a/b/c/d'. address.split('/') will result in ['domain.com', 'a', 'b', 'c', 'd'], while address.split('/', maxsplit=1) will do only minimal required job and result in ['domain.com', 'a/b/c/d'].

O53 Not using generators 1% of all issues

Generators allow you to memory-efficiently iterate over collection of items because you don't need to store all of them in memory at once. Also, if you break at some iteration, you won't even spend memory and CPU time on processing unneeded items.

O55 Ask forgiveness, not permission 1% of all issues

Idiomatic python way is to "do something, and if it fails - catch an exception". Sometimes developers do it other way round: "ask if something is possible, then do it". This adds an overhead for that "asking" part. For example, checking some key in dictionary and (if it's present) access it requires two operations, while accessing key without check is one operation. Of course, you can't always follow this pattern and it depends on the situation. Read more about this on Google.





Readability issues

R1 Missing type hints 10% of all issues

Type hints help humans and linters (like mypy) to understand what to expect "in" and "out" for a function. Not only it serves as a documentation for others (and you after some time, when the code is wiped from your "brain cache"), but also allows using automated tools to find type errors.

R46 Not using pathlib 3% of all issues

Python's built-in pathlib library is nice! It's less verbose than os.path and is easier to read, for example: root = Path('root'); subdir = root / 'subfolder1' / 'subfolder2'; subdir.mkdir(exist_ok=True). Use it!

R14 Not using "early quit" 3% of all issues

"Early quit" is a common pattern which reduces branching and indents in python code. Instead of writing if a: <1000 lines of code> else: <return / raise exception>, revert the condition: if not a: <return / raise exception>. This way the "else" clause is not needed, and program logic is straightforward: if something is wrong, quit or raise, otherwise go on.

R41 Too many lines of code 3% of all issues

Functions should be small - at least fit your screen's height. Otherwise they will be hard to read and hard to test. Try splitting big function into smaller ones.

R6 Copy-paste 2% of all issues

Copy-paste lead to errors very, very often because developers forget to change value on one of copy-pasted lines . You should avoid it as much as possible. Usually a good solution is to extract the things that differ into separate variables.

R28 Not using dataclass 2% of all issues

Dataclasses let you get rid of many boilerplate code, most often the "init hell": def __init__(self, a): self.a = a. With dataclasses, it's all done automatically!

R4 Range-based iteration 1% of all issues

Using len and range in python's for loop smells. Idiomatic python iteration looks like for element in collection. If you need element's index as well, use for i, element in enumerate(collection).

R61 Not using f-strings 1% of all issues

F-strings are powerful and very easy to read. Compare: 'text ' + str(number) + ' text' vs f'text {number} text'

R63 Creation and population 1% of all issues

There is no need to create something and then modify it if you can create that something already with modifications. For example, items = []; items.append(1) -> items = [1]

R21 Bad function name 1% of all issues

By looking at function name, stranger should roughly understand what the function does. If he doesn't, then the name should be changed.

R40 Overwriting builtins 1% of all issues

Never overwrite built-in names, like list, sum or max - if later someone wants to use built-in function, he will find it replaced with your variable. If you really need to use one of those names, append underscore: sum_ = 15.

R7 Redundant "else" 1% of all issues

There is a well-known pattern "if a: return one else: return two". In fact, the else keyword is redundant, because when first if condition evaluates to True, the function returns, otherwise control flow goes on. Removing else makes the code cleaner and easier to read.

R19 Bad variable name 1% of all issues

Simple rule: read a variable name as if you were English-speaking person without a context and guess what's inside - and if you have no clue, then you should give it a better name. Exceptions: i, j, x, y etc.

R37 Not using response.raise_for_status() 1% of all issues

requests module has built-in method to raise exception on bad response: response.raise_for_status(). If you need to perform some additional actions on bad response, use if not response.ok: ...

R27 Comparing with True / False 1% of all issues

if a == True should be replaced with if a is True, or better just if a. if a == False should be replaced with if a is False, or better just if not a. But be careful with cases when a may be None or is an object which has special rules of conversion to boolean.

R34 Should be classmethod 1% of all issues

Some methods within a class don't depend on self variable. This is an indication that such methods should be classmethods. Use the @classmethod decorator.

R48 Magic variable 1% of all issues

Magic variables (usually numeric values without explanation why the values are exactly like that) are evil. One should make them class / module variables or put in some config / env files with explanation why these values were chosen.

R2 Too many parameters 1% of all issues

When function has too many parameters, it is very hard to read and also it is very easy to accidentally mix them and instead of calling fn(.., o, p, q) call fn(..., o, q, p). Solution is to either refactor the function to receive smaller amount of data (maybe function is too big and can be divided into smaller functions?), or to send one parameter which is some data structure and incorporates all the parameters (this structure could be NamedTuple, dataclass, or TypedDict).

R13 TypeVar not used 1% of all issues

When you want to show that the same type is used in different places - use TypeVar instead of Any.

R18 Defined too far from usage 1% of all issues

It is a good practice to create new variables as close as possible to the place where they are used. This minimizes cognitive load: fewer things to remember.

R35 Setattr / getattr 1% of all issues

Setattr and getattr are helpful when you calculate some attribute names dynamically. However, you later, your IDE, and other devs may have troubles understanding which attributes you accessed. Always try to avoid using setattr and getattr. Dictionaries are usually a good replacement.

R36 Not using keys() / values() / items() 1% of all issues

dict type has built-in methods: keys() - to get keys iterator; values() - to get values iterator; items() - to get (key, value) iterator. Use it.

R58 Super() with redundant args 1% of all issues

Calling super() doesn't need any arguments in most cases. super(Class, self) -> super().

R59 Not combining context managers 1% of all issues

Context managers may be combined: with open('file1') as file1, open('file2') as file2:

R62 Use class-based views 1% of all issues

Almost always class-based views are more readable and extendable than plain "function" views. As an example, CBVs allow you to handle GET, POST and other verbs in separate methods, which adds readability to your code.

R65 Not using `suppress` 1% of all issues

Use contextlib.suppress to ignore several types of exceptions: try: ... except ValueError: pass -> with suppress(ValueError): ....

R66 Inconsistent code 1% of all issues

If you start writing in one manner, you should continue doing so till the end. Otherwise it will be harder to read your code.

R70 Not using timedelta 1% of all issues

timedelta is a python built-in data structure which allows defining time periods in a human-readable way (timedelta(hours=5)) and converting to seconds. Developers often define time periods in ints like TIMEOUT = 5 * 60 which does not reveal which unit it is, and is hard to read. Use timedelta.

R71 Using map / filter + lambda 1% of all issues

map() and filter() functions came to python by accident and have rare use-case. Usually list comprehension / generator is more readable and faster. list(map(int, filter(lambda x: x > 5, items))) -> [int(x) for x in items if x > 5].