00:00 - 00:04
today we're going to be designing a
00:02 - 00:06
popular system design interview problem
00:04 - 00:08
design a web hook service this is a
00:06 - 00:10
common question asked in interviews at
00:08 - 00:12
big tech companies like Google and Meta
00:10 - 00:15
so let's go through how to ace this
00:12 - 00:17
interview first let's clarify what a web
00:15 - 00:20
hook is a web hook is a communication
00:17 - 00:22
method that allows one software system
00:20 - 00:25
to automatically send real-time data or
00:22 - 00:26
notifications to another system when a
00:25 - 00:29
specific event
00:26 - 00:31
occurs unlike traditional APIs which
00:29 - 00:34
require constant polling to check for
00:31 - 00:36
updates web hooks push the information
00:34 - 00:39
as soon as the event happens this makes
00:36 - 00:42
web hooks both efficient and real
00:39 - 00:45
time now that we know what a web hook is
00:42 - 00:46
let's discuss what a web hook service is
00:45 - 00:49
a web hook service is a system that
00:46 - 00:51
handles these incoming notifications or
00:49 - 00:53
web hooks its job is to manage the
00:51 - 00:55
entire life cycle of a web hook
00:53 - 00:57
receiving the event processing it
00:55 - 01:00
performing necessary operations and
00:57 - 01:02
storing the results
01:00 - 01:04
to build an effective web hook service
01:02 - 01:07
you need to meet several functional
01:04 - 01:09
requirements first the service must
01:07 - 01:11
accept external API calls allowing it to
01:09 - 01:14
receive events like a payment being
01:11 - 01:16
processed or an order being shipped next
01:14 - 01:19
it needs to execute the corresponding
01:16 - 01:20
operations for each event for example if
01:19 - 01:23
a payment is successful the service
01:20 - 01:26
should update the order status in the
01:23 - 01:28
database finally it must persist both
01:26 - 01:31
the original event data and the results
01:28 - 01:33
of the operations for future reference
01:31 - 01:34
this is crucial for tracking auditing
01:34 - 01:38
and debugging beyond functionality the web
01:37 - 01:42
hook service also needs to be reliable
01:38 - 01:44
and robust additionally the service must
01:42 - 01:46
provide an at least once delivery
01:44 - 01:48
guarantee this means every event must be
01:46 - 01:52
processed at least once even if the
01:48 - 01:53
system encounters failures however this
01:52 - 01:56
also introduces the possibility of
01:53 - 02:00
duplicate event processing which brings
01:56 - 02:02
us to the next point idempotency
02:00 - 02:05
operations performed by the web hook
02:02 - 02:07
service must be
02:05 - 02:09
idempotent this ensures that even if the
02:07 - 02:12
same event is processed multiple times
02:09 - 02:14
the outcome remains
02:12 - 02:17
consistent let's start with a basic
02:14 - 02:19
design when an external system sends an
02:17 - 02:21
event via an HTTP request the web hook
02:19 - 02:25
service needs a request handler to
02:21 - 02:27
receive and process the event this data
02:25 - 02:30
is then immediately saved into a
02:27 - 02:31
database while straightforward this
02:30 - 02:33
design has a
02:31 - 02:36
flaw the request handlers handle the
02:33 - 02:38
HTTP requests as well as the business
02:36 - 02:41
logic of processing and persisting the
02:38 - 02:43
events if the request handler fails
02:41 - 02:46
after processing the event but before
02:43 - 02:48
saving it the event could be lost to
02:46 - 02:50
address this we can introduce a message
02:48 - 02:53
queue between the request handler and the
02:50 - 02:55
database we reduce the responsibility of
02:53 - 02:58
the request handler to handle initial
02:55 - 02:59
HTTP requests and let the message queue
02:58 - 03:02
consumer do the heavy lifting of
02:59 - 03:04
processing the actual event the message
03:02 - 03:06
queue temporarily holds events ensuring
03:04 - 03:09
that no data is lost even if the system
03:06 - 03:11
experiences issues the request handler
03:09 - 03:14
now focuses on handling HTTP requests
03:11 - 03:16
and enqueuing messages while separate
03:14 - 03:18
consumers process these events from the
03:16 - 03:20
queue and save them to the
03:18 - 03:24
database this design offers several
03:20 - 03:25
benefits including failure recovery load
03:24 - 03:27
buffering and
03:25 - 03:30
scalability let's break down how these
03:27 - 03:32
components interact in a real world
03:30 - 03:34
scenario the
03:41 - 03:45
process begins when an external system
03:43 - 03:48
triggers an event such as a payment
03:45 - 03:50
confirmation the client server sends
03:48 - 03:53
this event to the web hook service via
03:50 - 03:55
an API call to the request handler the
03:53 - 03:58
request handler validates the event and
03:55 - 04:01
enqueues it into the message
03:58 - 04:03
queue once enqueued the request handler sends a
04:01 - 04:06
success response back to the client
04:03 - 04:08
server queue consumers then fetch events
04:06 - 04:09
from the queue process them and store
04:08 - 04:12
the results in the
04:09 - 04:15
database after successful processing the
04:12 - 04:17
event is dequeued marking the completion of
04:15 - 04:20
the process notice that the request
04:17 - 04:22
handler returns 200 as soon as the enqueue
04:20 - 04:25
operation is successful this is to
04:22 - 04:27
confirm we received the event the
04:25 - 04:30
service level agreement in our requirements
04:27 - 04:32
is to eventually process the event this
04:30 - 04:34
means our system must be resilient
04:32 - 04:36
designed to keep functioning even if
04:34 - 04:39
certain components fail along the
04:36 - 04:41
way let's explore how to maintain system
04:39 - 04:43
functionality in the face of failures
04:41 - 04:46
this is an essential component in system
04:43 - 04:48
design let's start with the request
04:46 - 04:51
handler if the request handler fails
04:48 - 04:54
after receiving an event but before
04:51 - 04:57
enqueuing it the client service will not
04:54 - 05:00
get the HTTP 200 response this way the
04:57 - 05:03
client knows to retry if necessary
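
The flow described so far (validate, enqueue, return 200, then consume and dequeue only after the result is stored) can be sketched in Python. This is a minimal illustration under stated assumptions, not a production design: `InMemoryQueue`, `handle_request`, `consume_one`, the `id` and `type` event fields, and the 400/503 status codes are all hypothetical choices, and the in-memory queue only stands in for a real durable queue.

```python
import json
from collections import deque


class InMemoryQueue:
    """Hypothetical stand-in for a durable message queue (not durable itself)."""

    def __init__(self):
        self._messages = deque()

    def enqueue(self, message):
        self._messages.append(message)

    def peek(self):
        # Look at the oldest message without removing it.
        return self._messages[0] if self._messages else None

    def ack(self):
        # Remove the oldest message; called only after processing succeeded.
        self._messages.popleft()

    def __len__(self):
        return len(self._messages)


def handle_request(raw_body, queue):
    """Validate the event, enqueue it, and only then report success."""
    try:
        event = json.loads(raw_body)
    except ValueError:
        return 400  # malformed payload
    if not isinstance(event, dict) or "id" not in event or "type" not in event:
        return 400  # assumed minimal schema: every event has an id and a type
    try:
        queue.enqueue(event)
    except Exception:
        return 503  # not enqueued, so the sender never sees 200 and can retry
    return 200


def consume_one(queue, process, store):
    """Process the oldest event and dequeue it only after the result is stored."""
    event = queue.peek()
    if event is None:
        return False
    result = process(event)  # if this raises, the event stays in the queue
    store[event["id"]] = {"event": event, "result": result}
    queue.ack()
    return True
```

Because the handler returns 200 only after `enqueue` succeeds, and the consumer acknowledges only after the result is stored, a crash at any step leaves the event either un-acknowledged to the sender or still in the queue.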
05:00 - 05:04
for message queue failures use durable queues
05:03 - 05:07
that persist messages to disk and
05:04 - 05:09
replicate across multiple nodes this
05:07 - 05:12
ensures that events aren't lost even if
05:09 - 05:15
the queue server crashes if a queue consumer
05:12 - 05:17
fails multiple instances can take over
05:15 - 05:19
ensuring continuous
05:17 - 05:21
operation message acknowledgement and
05:19 - 05:24
dequeuing the message should only happen
05:21 - 05:27
after successful processing and database
05:24 - 05:29
storage finally comprehensive monitoring
05:27 - 05:32
alerts and chaos engineering can help
05:29 - 05:33
identify and address potential issues
05:32 - 05:35
before they impact the overall
05:35 - 05:41
system now let's dive into how to handle
05:38 - 05:42
duplicate events duplicate events can
05:41 - 05:45
occur if the external system sends the
05:42 - 05:47
same event multiple times the web hook
05:45 - 05:49
service needs to recognize and handle
05:47 - 05:52
these duplicates to ensure each event is
05:49 - 05:54
processed only once and to ensure
05:52 - 05:56
idempotency this can be done by
05:54 - 05:58
deduplicating events in the message queue
05:56 - 06:00
or by checking the database before
05:58 - 06:02
processing an event
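
Checking the database before processing might look like the sketch below. It assumes each event carries a unique ID and uses an in-memory SQLite table as a stand-in for the real database; `open_store`, `process_once`, and the `processed_events` table are hypothetical names.

```python
import sqlite3


def open_store():
    """Create an in-memory table that records which event IDs were processed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE processed_events (event_id TEXT PRIMARY KEY, result TEXT)"
    )
    return conn


def process_once(conn, event_id, handler):
    """Run handler only if event_id is not recorded yet; return True if it ran."""
    row = conn.execute(
        "SELECT 1 FROM processed_events WHERE event_id = ?", (event_id,)
    ).fetchone()
    if row is not None:
        return False  # duplicate delivery: skip the side effects
    result = handler()
    with conn:  # commits on success, rolls back if the insert fails
        # The PRIMARY KEY rejects a second insert for the same event ID.
        conn.execute(
            "INSERT INTO processed_events (event_id, result) VALUES (?, ?)",
            (event_id, result),
        )
    return True
```

With a single consumer this is enough to skip repeats. With several consumers the check and the insert can race, so a real system would typically rely on the unique constraint or a transaction around both steps.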
06:00 - 06:05
for example Amazon SQS has message deduplication
06:02 - 06:08
support we can use event ID
06:05 - 06:10
as the deduplication ID so that the
06:08 - 06:12
queue can filter duplicates out
06:10 - 06:14
automatically this offloads the work to
06:12 - 06:16
the queue and simplifies our design
06:14 - 06:18
however deduplication is typically
06:16 - 06:21
supported in FIFO queues which are slower
06:18 - 06:23
than standard queues if we want to use the
06:21 - 06:25
more performant standard queue we can use a
06:23 - 06:28
simple in-memory database lookup with a
06:25 - 06:30
TTL to implement deduplication finally
06:28 - 06:31
let's talk about security
06:30 - 06:33
since the web hook is open to
06:31 - 06:35
internet traffic it's important to
06:33 - 06:37
verify the request is indeed sent by the
06:35 - 06:39
trusted service the most common method
06:37 - 06:41
for securing web hooks is by using a
06:39 - 06:43
hash-based signature when you register
06:41 - 06:46
your web hook service with Stripe a
06:43 - 06:47
shared secret key is generated every
06:46 - 06:49
time Stripe sends a message it
06:47 - 06:52
calculates the signature using the
06:49 - 06:55
secret key and the message body this
06:52 - 06:57
signature is then included in the HTTP
06:55 - 06:59
header when your service receives the
06:57 - 07:02
message it performs the same calculation
06:59 - 07:03
using its copy of the secret key
07:02 - 07:06
and compares the result with the
07:03 - 07:09
signature in the header if they match
07:06 - 07:11
you know the request is authentic
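
The signature check described above can be sketched with Python's standard `hmac` module. This is a simplified illustration, not any provider's exact scheme: HMAC-SHA256 over the raw body and a hex-encoded header value are assumptions, and `sign` and `verify` are hypothetical helper names. Real providers such as Stripe define their own header format and signed payload, so their documentation is the authority.

```python
import hashlib
import hmac


def sign(secret, body):
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret, body, header_signature):
    """Recompute the signature and compare it with the one from the header."""
    expected = sign(secret, body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode(), header_signature.encode())
```

The receiver should compute the signature over the exact bytes it received, before any JSON parsing or re-serialization, since even a whitespace change produces a different signature.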