Kindle Notes & Highlights
Read between
March 20 - March 27, 2021
sometimes we do not know the number or type of these fields in advance. This calls for the form to be dynamically generated. This pattern is sometimes called dynamic form or runtime form generation.
class PersonDetailsForm(forms.Form):
    name = forms.CharField(max_length=100)
    age = forms.IntegerField()

    def __init__(self, *args, **kwargs):
        upgrade = kwargs.pop("upgrade", False)
        super().__init__(*args, **kwargs)

        # Show first class option?
        if upgrade:
            self.fields["first_class"] = forms.BooleanField(
                label="Fly First Class?")

Now, we just need to pass the upgrade=True keyword argument, as in PersonDetailsForm(upgrade=True), to make an additional Boolean input field (a checkbox) appear. A newly introduced keyword argument has to be removed (here with kwargs.pop) before calling the parent class's initializer, which would otherwise reject it as unexpected.
This is a fairly straightforward approach, with each form specifying a different view as its action. For example, take the subscribe and unsubscribe forms. There can be two separate view classes to handle just the POST method from their respective forms.
CreateView
DetailView
UpdateView
DeleteView
Race condition
Starvation
Deadlock
Debugging challenge
Order preservation
Callback pattern: In this pattern, when a caller calls a service, it specifies an endpoint to be called when the operation is completed. This is similar to specifying callbacks in some programming languages like JavaScript. When used purely as an HTTP callback, it is called a WebHook.
Polling pattern: Polling, as the name suggests, involves the client periodically checking a service for any new events. This is often the least desirable means of asynchronous communication, as polling increases system utilization and becomes difficult to scale.
A task can also be scheduled to run periodically using what Celery calls a Celery beat process. You can configure it to kick off tasks at certain time intervals, such as every 10 seconds or at the start of a day of the week. This is great for maintenance jobs such as backups or polling the health of a web service.
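Such a schedule is declared through the app's beat_schedule setting; a minimal sketch, in which the app name, broker URL, and task paths are placeholders:

```python
from celery import Celery
from celery.schedules import crontab

app = Celery("proj", broker="redis://localhost:6379/0")  # hypothetical broker URL

app.conf.beat_schedule = {
    "backup-every-10s": {
        "task": "proj.tasks.backup",        # hypothetical task path
        "schedule": 10.0,                   # run every 10 seconds
    },
    "monday-health-check": {
        "task": "proj.tasks.check_health",  # hypothetical task path
        "schedule": crontab(minute=0, hour=7, day_of_week="mon"),
    },
}
```

Running `celery -A proj beat` alongside the workers then enqueues these tasks on schedule.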
In Celery, you can choose to retry automatically or manually. Celery makes it easy to fine-tune its automatic retry mechanism. In the following example, we specify multiple retry parameters:

@shared_task(autoretry_for=(GatewayError,),
             retry_backoff=60,
             retry_kwargs={'max_retries': 5},
             retry_jitter=True)
def fetch_feed(feed_id):
    ...
In this case, it is just the GatewayError exception. You may also pass the Exception base class here to automatically retry for all exceptions.
waiting longer and longer for a retry is called exponential backoff
thundering herds. If a large number of tasks have the same retry pattern and request a resource at the same time,
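The growth of the delay, and the jitter that breaks up a thundering herd, can be sketched in a few lines. The base and cap values here are arbitrary illustrations, not Celery's internals:

```python
import random

def backoff_delay(retries, base=60, cap=3600, jitter=False):
    """Delay before the next retry: base * 2**retries seconds, capped."""
    delay = min(base * (2 ** retries), cap)
    if jitter:
        # Full jitter: pick a random delay up to the computed one, so tasks
        # retrying together do not hit the resource at the same instant.
        delay = random.uniform(0, delay)
    return delay

print([backoff_delay(r) for r in range(5)])  # → [60, 120, 240, 480, 960]
```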
a task that always places a fresh order when called is not idempotent, but a task that cancels an existing order is idempotent. Operations that only read the state of the world and do not have any side effects are nullipotent. As Celery's architecture relies on tasks being idempotent, it is important to study all the side effects of a non-idempotent task and convert it into an idempotent task. You can do this by checking whether the task has been executed previously (if it has, then abort)
Finally, call your task multiple times to test whether it leaves your system in the same state.
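A toy illustration of that check-then-abort pattern, where an in-memory set stands in for a real persistence layer:

```python
processed = set()  # stand-in for a database table recording completed tasks

def cancel_order(order_id):
    """Idempotent: calling it any number of times leaves the same end state."""
    if order_id in processed:   # executed previously? then abort
        return "already-cancelled"
    processed.add(order_id)     # the side effect happens exactly once
    return "cancelled"

print(cancel_order(42))  # → cancelled
print(cancel_order(42))  # → already-cancelled
```

Calling it twice with the same argument exercises exactly the test described above: the second call changes nothing.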
Now, if A and B read the counter at the exact same time, they will overwrite each other's value by the end of the task, so the final value will depend on which one writes last. In fact, the global counter's value will be highly dependent on the order in which the tasks are executed. Thus, race conditions result in invalid or corrupt data.
A practical solution is to insert the status of each image into a table indexed by a unique identifier of the image, such as its hash value or file path:

Image hash               Completed at                Matched image path
SHA256:b4337bc45a8f...   2018-02-09T15:15:11+05:30   /celeb/7112.jpg
SHA256:550cd6e1e8702...  2018-02-09T15:17:24+05:30   /celeb/3529.jpg

You can find the total number of successful matches by counting rows in this table. Additionally, this approach allows you to break down the successful matches by date or time. The race conditions are avoided,
Row-level locks are done in Django by calling select_for_update() on your QuerySet within a transaction. Consider this example:

with transaction.atomic():
    feed = Feed.objects.select_for_update().get(id=id)
    feed.html = sanitize(feed.html)
    feed.save()

By using select_for_update, we lock the Feed object's row until the transaction is done.
it is better to use F() expressions to avoid a race condition. F() expressions avoid the need to pull the value from the database into Python memory and back. Consider the following instance:

from django.db.models import F

feed = Feed.objects.get(id=id)
feed.subscribers = F('subscribers') + 1
feed.save()

It is only when the save() operation is performed that the increment operation is converted to an SQL expression and executed within the database.
It is easy to forget that each time we call a Celery task, the arguments get serialized before it enters the queue. Hence, it is not advisable to send a Django ORM object or any large object that might clog up the queues.
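A quick way to see why, using pickle as a stand-in for the queue's serializer (Feed here is a plain illustrative class, not the Django model):

```python
import pickle

class Feed:  # plain stand-in for an ORM object
    def __init__(self, id, html):
        self.id = id
        self.html = html

feed = Feed(42, "<html>" + "x" * 100_000 + "</html>")

# Sending the object serializes everything it holds onto the queue...
print(len(pickle.dumps(feed)))     # ~100 kB
# ...while sending just the primary key is tiny; the worker re-fetches
# a fresh copy from the database, avoiding stale data too.
print(len(pickle.dumps(feed.id)))  # a few bytes
```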
threads need to be synchronized while accessing shared resources,
coroutine has well-defined places where execution is handed over. As a result, you can make changes to a shared state as long as you leave it in a known state.
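A small illustration of that guarantee: because control is handed over only at each await, the read-modify-write on the shared counter below can never be interleaved mid-update, so no increment is ever lost:

```python
import asyncio

counter = 0

async def increment(n):
    global counter
    for _ in range(n):
        counter += 1            # read-modify-write completes before any switch
        await asyncio.sleep(0)  # execution is handed over only here

async def main():
    # two coroutines updating the same shared state concurrently
    await asyncio.gather(increment(1000), increment(1000))

asyncio.run(main())
print(counter)  # → 2000, never a lost update
```

The equivalent code on preemptive threads would need a lock, because a thread can be suspended between the read and the write.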
If you can run a maximum of hundreds of threads, you might be able to run tens of thousands of coroutines, given the same memory.
The downside of coroutines is that you cannot mix blocking and non-blocking code.
import asyncio
import aiohttp
from time import time

sites = [
    "http://news.ycombinator.com/",
    "https://www.yahoo.com/",
    "http://www.aliexpress.com/",
    "http://deelay.me/5000/http://deelay.me/",
]

async def find_size(session, url):
    async with session.get(url) as response:
        page = await response.read()
        return len(page)

async def show_size(session, url):
    size = await find_size(session, url)
    print("Read {:8d} chars from {}".format(size, url))

async def main(loop):
    async with aiohttp.ClientSession() as session:
        tasks = []
        # (completing the truncated highlight: schedule one task per site
        # and wait for them all to finish)
        for site in sites:
            tasks.append(loop.create_task(show_size(session, site)))
        await asyncio.wait(tasks)

if __name__ == "__main__":
    start_time = time()
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main(loop))
    print("Ran in {:6.3f} secs".format(time() - start_time))
You can consider faster implementations such as uvloop to speed things up further.
Concurrency will help you avoid blocking the processor core while waiting for, say, I/O events, while parallelism will help to distribute work among all the available cores. In both cases, you are not executing synchronously, that is, waiting for a task to finish before moving on to another task. Asynchronous systems might seem to be the most optimal; however, they are harder to build and reason about.
Django Channels was originally created to solve the problem of handling asynchronous communication protocols, such as WebSockets,
Channels, currently implemented with a Redis backend, provides an at-most-once delivery guarantee, while Celery provides an at-least-once guarantee. This essentially means that Celery will retry when a delivery fails until it receives a successful acknowledgment. In the case of Channels, it is pretty much fire-and-forget. Secondly, Channels does not provide information on the status of a task out of the box. We need to build such functionality ourselves, for instance by updating the database. Celery task statuses can be queried and persisted.
URI versioning
Query string versioning
Custom header versioning
Media type versioning
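In Django REST framework, each of these schemes maps to a built-in versioning class (URLPathVersioning for URI versioning, QueryParameterVersioning for query strings, AcceptHeaderVersioning for media types; custom header versioning needs a BaseVersioning subclass). A settings sketch, with arbitrary version names:

```python
# settings.py
REST_FRAMEWORK = {
    # Swap in QueryParameterVersioning or AcceptHeaderVersioning
    # to switch schemes without touching the views.
    "DEFAULT_VERSIONING_CLASS": "rest_framework.versioning.URLPathVersioning",
    "DEFAULT_VERSION": "v1",
    "ALLOWED_VERSIONS": ["v1", "v2"],
}
```

Views then read the negotiated version from request.version.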
The API looks great, except for the security risk of exposing the user model's primary key publicly. Thankfully, serializers can be changed to add fields that are not present in the model, as the following code demonstrates:

class PostSerializer(serializers.ModelSerializer):
    posted_by = serializers.SerializerMethodField()

    def get_posted_by(self, obj):
        return obj.posted_by.username

    class Meta:
        model = models.Post
        fields = ("posted_by", "message",)

The SerializerMethodField is a read-only field that gets its value from a method on the serializer class.
Ensure that you cover essential topics such as how to obtain authentication tokens or pricing, as well as high-level concepts. Good API documentation is crucial for its adoption and can even overcome a poorly designed API,
Traditionally, a page containing a lot of data was paginated to reduce page loading time and thereby improve the user experience. Then, Asynchronous JavaScript And XML (AJAX) technologies gave browsers the ability to asynchronously load content. Thus, the Infinite Scrolling design pattern was born,
As a best practice, it is advisable to create a completely repeatable environment for a project. This includes having a requirements file with all transitive dependencies listed, pinned to exact versions, and accompanied by --hash digests. The --hash digests of the packages look like this:

Django==1.5.9 --hash=sha256:2cf24dba5fb0a30e26e83b2ac5...
the best guide to understanding the application's functionality is the executable test cases and the code itself.
Code should be seen as a liability and not as an asset. As counter-intuitive as it might sound, if you can achieve your business goals with a smaller amount of code, you have dramatically increased your productivity. Having less code to test, debug, and maintain can not only reduce ongoing costs, but also make your organization more agile and flexible to change.
Irrespective of whether you are adding features or trimming your code, you must not touch your working legacy code without tests in place.