STOCK EXCHANGE SYSTEM DESIGN | AMAZON INTERVIEW QUESTION DESIGN STOCK EXCHANGE
摘要
TLDRDas Video erläutert detailliert die Softwarearchitektur für ein Börsensystem, das sowohl Aktien- als auch Kryptowährungs- und Forex-Handel unterstützt. Es beschreibt die notwendigen Systemanforderungen, einschließlich der Fähigkeit, Kauf- und Verkaufsorders in Echtzeit zu verarbeiten und Risiken zu managen. Der Kern des Systems ist der Matching-Engine, der mit Datenstrukturen wie Listen, Linked Lists oder Heaps arbeitet, um Orders effizient zu matchen. Das Systemdesign diskutiert außerdem verschiedene Komponenten wie API-Gateways, Risiko-Managementsysteme und Datenbankspeicherung, um hohe Verfügbarkeit und schnelle Latenzzeit zu gewährleisten. Eine Vorgehensweise zur Sicherung der Systemzuverlässigkeit, einschließlich der Verwendung eines Zookeepers für den aktiven/passiven Status von Matching-Engines, wird ebenfalls beschrieben. Schließlich erklärt das Video, wie Echtzeit-Updates den Nutzern und Hochfrequenz-Händlern übermittelt werden.
心得
- 📈 Börsensystem für Aktien, Krypto und Forex
- ⚙️ Matching-Engine zur Order-Abwicklung
- 🕒 Echtzeit-Preisinformationen erforderlich
- 📉 Risikoanalyse vor Orderausführung
- 💡 Nutzung von Listen oder Heaps im Engine
- 🌐 API-Gateways für Anfragenmanagement
- 🔄 Hohe Verfügbarkeit und Skalierbarkeit
- 🗂️ Einsatz von Zookeeper für zuverlässigen Betrieb
- 📊 Echtzeit-Updates für Nutzer und HFT
- 💾 Speicherung im Speicher für Effizienz
时间轴
- 00:00:00 - 00:05:00
Marina stellt Softwarearchitektur und Systemdesign für den Börsenhandel vor. Das gleiche Design gilt auch für Kryptowährungen und den Forexhandel. Die Anforderungen umfassen Kauf- und Verkaufsfunktionen, Echtzeit-Preisaktualisierungen, Bestandsverwaltung und Risikomanagement. Ziele sind die Verarbeitung von 0,1 Millionen Bestellungen pro Sekunde, niedrige Latenz und hohe Verfügbarkeit.
- 00:05:00 - 00:10:00
Das Herzstück des Systems ist die Matching-Engine, die Kauf- und Verkaufsaufträge zusammenbringt. Diese Engine analysiert eingehende Kauf- und Verkaufsaufträge, sortiert sie nach bestimmten Kriterien und führt sie aus, indem sie die besten Preis- und Mengenübereinstimmungen findet. Datenstrukturen wie Arrays, Listen, Heaps werden verwendet, um die Effizienz zu optimieren.
- 00:10:00 - 00:15:00
Einzuhaltende Datenstrukturen sind notwendig, um Operationen effizient durchzuführen. Es ist wichtig, dass alle Daten im Speicher verarbeitet werden, um die Geschwindigkeit zu maximieren. Seriellisierbare Strukturen sind erforderlich, um einen Systemzustand beizubehalten, falls es zu einem Absturz kommt. Diese sichern die Fortsetzung von Prozessen nach einem Ausfall.
- 00:15:00 - 00:20:00
Systemdesign-Diagramm erklärt den Datenfluss vom Benutzer über API-Gateways, zu Risikomanagement, Routern und der Matching-Engine. Einzigartige IDs werden zur Auftragsverfolgung verwendet. Risikomanagementsysteme prüfen die Sicherheit von Aufträgen basierend auf festgelegten Parametern. Aufträge werden anhand ihrer Art bestimmten Warteschlangen zugewiesen, um die Verarbeitung zu optimieren.
- 00:20:00 - 00:25:00
Das Matching erfolgt in einem parallelen Setup, bei dem jede Warteschlange von einem separaten virtuellen Maschine bedient wird, die eine Instanz der Matching-Engine betreibt. Active und Passive Matching Engines sorgen für High Availability durch parallele Verarbeitung und Übernahme im Fehlerfall. Zookeeper koordiniert zwischen den Instanzen.
- 00:25:00 - 00:35:02
Sobald eine Übereinstimmung gefunden wurde, sendet die Matching-Engine Daten zur Auftragsabwicklung und Preisänderung an Server, die die Datenbanken und Benutzeroberflächen aktualisieren. Der Zahlungsverkehr wird integriert, um die Konten der Benutzer bei Transaktionen zu belasten oder zu gutschreiben. Es gibt Maßnahmen, um die Datenspeicherung zu gewährleisten und das System bei Abstürzen wiederherzustellen.
思维导图
视频问答
Welche Märkte kann das beschriebene System abdecken?
Das System kann für Aktienmärkte, Kryptowährungen und Forex-Handel verwendet werden.
Was ist die Hauptfunktion des Matching-Engines?
Der Matching-Engine gleicht Kauf- und Verkaufsaufträge effizient ab.
Welche Eigenschaften muss der Matching-Engine haben?
Er muss in der Lage sein, Aufträge hinzuzufügen, zu löschen, aufzuteilen und effektiv in O(1) oder log(n) Zeit abzugleichen.
Warum muss der Matching-Engine seine Operationen im Speicher durchführen?
Damit der Engine effizient arbeitet und tausende Orders pro Sekunde verarbeiten kann, muss er Prozesse im Speicher ausführen.
Welche Ansätze gibt es zur Sicherstellung der Systemzuverlässigkeit?
Verwendung von aktiven und passiven Matching-Engines sowie regelmäßiges Speichern des Systemstatus als Backup.
查看更多视频摘要
Das erste Trimester der Schwangerschaft: das erwartet dich in SSW 1-12
Rückwärts seitlich rechts einparken | Grundfahraufgabe Klasse C | LKW Führerschein | FS Strothmann
Rückwärts quer o. schräg einparken | Grundfahraufgabe Klasse C | LKW Führerschein | FS Strothmann
Studie: ChatGPT schlägt ÄRZTE um 16%?!
Rückwärts Rechts in eine Einmündung | Grundfahraufgabe Klasse C | LKW Führerschein | FS Strothmann
LKW-Ausbildung - Verbinden und Trennen - Links um die Ecke - Sicheres Einsteigen
- 00:00:00hello everyone my name is marina and
- 00:00:01indecision let's understand software
- 00:00:03architecture or system design for stock
- 00:00:05exchange and this system design is also
- 00:00:08holds good for crypto currency or forex
- 00:00:11trading as well
- 00:00:13thanks a lot for all the support until
- 00:00:15now and in future as well if you like
- 00:00:19the content on this channel and if you'd
- 00:00:20like to share or buy a cup of coffee for
- 00:00:24me you can do so by joining to the
- 00:00:26channel
- 00:00:29in written I have custom emojis for
- 00:00:32Prada Me's and also I'm gonna feature
- 00:00:35your name in the end of the videos which
- 00:00:38I am going to upload in future and
- 00:00:40thanks a lot for the support we can use
- 00:00:43this system wherever we have to match
- 00:00:45the supply to the demand or vice versa
- 00:00:47so here are the requirements for our
- 00:00:50system and this is the goals of our
- 00:00:52system design so the requirements which
- 00:00:54we want to cover in this system design
- 00:00:56is users should be able to buy and sell
- 00:00:59stocks on this platform the second one
- 00:01:01is real-time price update information
- 00:01:03because we need to keep on sending the
- 00:01:06real-time price information to the users
- 00:01:08for the fact that users will be looking
- 00:01:11at the price fluctuation and then he is
- 00:01:13going to place an order to buy our set
- 00:01:16so it has to be real-time and the third
- 00:01:18one is we should have basic
- 00:01:19functionality like look at all the
- 00:01:21stocks which he owns and also the
- 00:01:23ordering in history or information and
- 00:01:25the fourth one is risk management we
- 00:01:27have to have our system which analyzes
- 00:01:29the risk before executing the order
- 00:01:31matching and what are the goals of our
- 00:01:34system so the first one is we have to
- 00:01:36process at least you know 0.1 million
- 00:01:39orders for a second
- 00:01:41second one is thousands of stocks are
- 00:01:44registered companies are in the platform
- 00:01:46so we have to process orders for all of
- 00:01:48them with low latency and the third one
- 00:01:51is geo-specific
- 00:01:52we don't have to design the system to
- 00:01:54cater users from all over the world
- 00:01:56because usually the exchanges are
- 00:01:58specific to our country even though
- 00:02:00there are users from other outside the
- 00:02:02country it's fine
- 00:02:03our goal here is to design the system
- 00:02:06specifically for users in a specific geo
- 00:02:08and the fourth one is low latency we
- 00:02:10have to process this art of matching as
- 00:02:13fast as possible it could be in an
- 00:02:15around 300 milliseconds something like
- 00:02:18that and the fifth one is high
- 00:02:19availability or scalability where in
- 00:02:21this system should be always available
- 00:02:23for users to place orders to buy or sell
- 00:02:26and also the system should be able to
- 00:02:28add more companies or stocks into the
- 00:02:31platform and keep on scaling linearly
- 00:02:34first let's design the core component of
- 00:02:37our system that is matching engine but
- 00:02:40before understanding the match
- 00:02:42engine you need to understand what
- 00:02:43exactly stock exchange does a stock
- 00:02:46exchange is a place which is usually a
- 00:02:49regulated place where people come
- 00:02:51together and then they have something to
- 00:02:54buy and something to sell and the
- 00:02:56matching engine is going to match those
- 00:02:58two together and how does it look like
- 00:03:00it usually looks like this if you see
- 00:03:03this image is for the reliance stock
- 00:03:06which has these many orders to be debate
- 00:03:10or buy and the right side shows the
- 00:03:14orders to be sold and this one is the
- 00:03:17screen shot taken from the crypto
- 00:03:18currency exchange where people want to
- 00:03:20trade bitcoins and it all boils down to
- 00:03:23this one called as order book which
- 00:03:25usually looks like this where in this
- 00:03:29example we can see that there are four
- 00:03:31columns the buy quantity is the number
- 00:03:34of shares waiting to be born and the buy
- 00:03:37price is the price which a buyer is
- 00:03:39willing to pay for this particular shape
- 00:03:42and the set price is on the right side
- 00:03:45is the price which a seller is quoting
- 00:03:48for a share the set quantity is the
- 00:03:51number of shares you want to sell so in
- 00:03:54total we have around 500 shares waiting
- 00:03:56to be bought at a rate of 5,000 250 and
- 00:04:00we have thousand six 50 shares waiting
- 00:04:02to be sold at five thousand two hundred
- 00:04:04and fifty two so now let's design our
- 00:04:08matching engine so the characteristic of
- 00:04:10our matching engine should be it should
- 00:04:12be able to add an order it should be
- 00:04:14able to cancel the order if I don't want
- 00:04:16to buy or sell and it should be able to
- 00:04:18split an order that means that if I want
- 00:04:20to buy ten stocks and if nobody is
- 00:04:23selling 10 stocks on a whole our
- 00:04:26matching engine should be able to buy in
- 00:04:29a split or whatever it can and it should
- 00:04:32be able to match obviously and the
- 00:04:34second one is it should be able to do
- 00:04:37all of these actions much efficiently
- 00:04:40ideally it is always good to do it in
- 00:04:42order of one are better in log n time
- 00:04:45and it should do everything in reliably
- 00:04:48so how does the matching engine work
- 00:04:51it's not really complicated it's very
- 00:04:53simple consider you have these orders to
- 00:04:56buy okay
- 00:04:57usually people buy try to buy the shares
- 00:05:02or the stocks or any commodities at
- 00:05:04cheap price so the winner is who wants
- 00:05:07to buy at the higher price so he's going
- 00:05:10to stand out right because anyone who is
- 00:05:13trying to sell something he doesn't want
- 00:05:14to sell cheap so usually the sellers as
- 00:05:17well won't try to sell it with higher
- 00:05:20price but the people who stands out in
- 00:05:23cell queue is one who wants to sell it
- 00:05:26cheaply so the buy order list will be
- 00:05:31sorted in the decreasing order where the
- 00:05:35buyer with higher price or the buyer who
- 00:05:38wants to buy a high price will stand out
- 00:05:40okay here in this case there are a
- 00:05:42couple of orders or people who wants to
- 00:05:44buy some stocks and here someone want to
- 00:05:47buy em an order with price of $10 and he
- 00:05:51wants to buy 5 stocks and here with
- 00:05:53price of $9 he wants to buy 10 this guy
- 00:05:56with the price of $8 he wants to buy 5
- 00:05:59stocks and in said cube there are a lot
- 00:06:02of orders here and we want to sort this
- 00:06:06in increasing order and this one in the
- 00:06:09decreasing order
- 00:06:10so here the one the guys who's who is
- 00:06:13are the orders which stands out is the
- 00:06:15one who wants to sell cheaper so in this
- 00:06:18case the orders are there are order with
- 00:06:21$8 and 5 quantity $9 and 5 20 and $9 by
- 00:06:26quantity $9 by quantity because no one
- 00:06:28wants to be reserved for cheap right
- 00:06:30because most likely the prices will go
- 00:06:32is same or it will be always higher so
- 00:06:36when we start it in the increasing order
- 00:06:38all the other orders which are selling
- 00:06:42at really high price will go in the end
- 00:06:44the orders who wants to sell it
- 00:06:46if the cheaper price will come in the
- 00:06:48front in this case they're the guys are
- 00:06:51the orders wants to buy at higher price
- 00:06:54will come in the starting of the list
- 00:06:57here so now how to match it it's very
- 00:07:00simple it's just like the array elements
- 00:07:02matching that's simple so what we have
- 00:07:05to do is take the first order from this
- 00:07:08this and then try to match from the
- 00:07:10starting of the cell list so $10 and 5
- 00:07:14so someone wants to buy five shares at
- 00:07:16$10 price but someone is also selling 5
- 00:07:20shares with $8 price now it's definitely
- 00:07:24a good match even we are getting for $2
- 00:07:27cheap now we can match this because the
- 00:07:29quantities are also matching that means
- 00:07:32that this and this can be matched so the
- 00:07:36the buyer who wanted to buy a $10 he is
- 00:07:39actually getting it only for $8 and
- 00:07:41these two are matched and we can execute
- 00:07:43this consider we executed so I'm gonna
- 00:07:46remove this from the list okay no those
- 00:07:51two orders are gone and who is standing
- 00:07:53out here so there is a guy who wants to
- 00:07:55buy at $9 and 10 stocks and someone
- 00:07:59selling some stocks with $9 price and
- 00:08:02high quantity now we can definitely
- 00:08:03match this and how do we match it is
- 00:08:06this one and this one get matched okay
- 00:08:08and the problem here is the quantity now
- 00:08:12he wants to buy 10 and here there is an
- 00:08:15order where the guy wants to sell only 5
- 00:08:18shares only but there are a couple more
- 00:08:21orders here where you can buy from but
- 00:08:23we can't do really match like that what
- 00:08:25do we do is we only purchase 5 starts
- 00:08:29from here okay and then we can either
- 00:08:33update this to 5 or other strategy is
- 00:08:38totally remove this and then we add it
- 00:08:41into this list so it will come back here
- 00:08:44with you know $9 and 5 why it will come
- 00:08:48back is every time when the new order
- 00:08:49comes into the list will always be
- 00:08:51sorting in the descending order and here
- 00:08:54we always sorted in the increasing order
- 00:08:56so the same order which was split comes
- 00:08:59back and sits in the front of this list
- 00:09:01and we can match it now now this and
- 00:09:03this will get matched and these two can
- 00:09:04be executed okay and these two are
- 00:09:06executed and similarly we can keep on
- 00:09:08executing and this list will keep on
- 00:09:10burning and as and when the new order
- 00:09:11comes in the list will also be inserted
- 00:09:14with the new orders like say someone
- 00:09:16wants to buy add say $10 20 shares and
- 00:09:20if there is something we can
- 00:09:21this match now $10 $9 $1 chip you can
- 00:09:24keep matching and then executing so what
- 00:09:27are the data structure we can use to do
- 00:09:29with this the first one is you can use
- 00:09:31lists and do this the second one is you
- 00:09:35can use linked lists and the third one
- 00:09:37is you can use heaps okay
- 00:09:41now in linked lists you can keep the
- 00:09:43order of this list in increasing and
- 00:09:46decreasing because we don't have to sort
- 00:09:48it because when we start usually we
- 00:09:51start from one order and then when we
- 00:09:53keep on appending we can always check
- 00:09:56and then insert it and the same thing
- 00:09:59with the list as well we start with
- 00:10:01empty lists and then as and when the
- 00:10:03entries are inserted we keep on
- 00:10:05searching for the proper place where
- 00:10:08that order should be inserted and then
- 00:10:09we keep inserting based on the price now
- 00:10:12the time complexity is you know right in
- 00:10:14the in this case of Lists
- 00:10:15the insertion deletion will all be order
- 00:10:19n but in case of access it will be order
- 00:10:22of 1 but in case of linked list it will
- 00:10:24be in order of 1 where this is the idea
- 00:10:27candidate where you can use and also you
- 00:10:29can use heaps where you can use max heap
- 00:10:32and min heap so in this case you can use
- 00:10:37max it because we always want the the
- 00:10:41order with the highest price so we can
- 00:10:43use massive here and we can use min heap
- 00:10:45here and then we can do perform this in
- 00:10:48log in time as well couple of things to
- 00:10:52remember here we need to make all of
- 00:10:54this operation in memory there is no
- 00:10:57escape to that the reason is we want to
- 00:11:00perform all of this so efficiently as
- 00:11:03soon as we write this into database this
- 00:11:06list into the database or into the disk
- 00:11:08we can tree need to function efficiently
- 00:11:11or we can't really process thousands of
- 00:11:14order millions of orders per second so
- 00:11:16key takeaway is we have to consider to
- 00:11:20do the whole thing always in memory and
- 00:11:24the second thing is we need to consider
- 00:11:27that these data structures are whatever
- 00:11:31we are doing should be serializable
- 00:11:33because we always
- 00:11:35want to keep this state somewhere
- 00:11:37periodically we want to take a backup
- 00:11:39and then keep it in the cache so in case
- 00:11:41of our matching engine crashes or the
- 00:11:45machine where this matching engine is
- 00:11:47running crashes we can go back to the
- 00:11:50persistent storage and we can populate
- 00:11:52the list which we had and all the orders
- 00:11:55in it so we get back to the state so you
- 00:11:58can think of the whole thing as a state
- 00:12:01machine where it does all of these
- 00:12:04things so the next thing we need to
- 00:12:05understand is what are the information
- 00:12:07which we need to capture for a specific
- 00:12:09order if it if it was a linked list in
- 00:12:12that case in each node what are the
- 00:12:14information we need to save or if it was
- 00:12:16a list in that case list of object or
- 00:12:19list of dictionary or list of pupil what
- 00:12:21is the information we want to capture
- 00:12:23the first thing is we need to capture
- 00:12:25the ID of the order and the next thing
- 00:12:27is the price of the order at which we
- 00:12:29want to sell or buy the quantity of the
- 00:12:32stocks or the shares which we want to
- 00:12:35buy and the status of that usually a
- 00:12:38queue and once we execute it maybe we
- 00:12:40can change it as executed and once we
- 00:12:42delete it we will not need the status
- 00:12:45anyway but if you want have any other
- 00:12:47use cases you can use that and of course
- 00:12:49metadata where you want us to store any
- 00:12:51other extra information so this could
- 00:12:55look like if you're using lists it will
- 00:12:57be looking language either struct or
- 00:13:00dictionary or it could be even tuple so
- 00:13:03it will be looking like this tuple with
- 00:13:06only values in it something like that
- 00:13:08but if it was a node we'll have all of
- 00:13:11these keys in it and you are basically
- 00:13:13linking it all together let's understand
- 00:13:17the system design diagram in here you
- 00:13:19can see there are users and there is hfp
- 00:13:22users are like any normal user who uses
- 00:13:26UI to interact with the system whereas
- 00:13:29hft basically uses API calls directly
- 00:13:33what does hft stands for is
- 00:13:35high-frequency trading what happens here
- 00:13:38is the companies or individual persons
- 00:13:40have will usually own couple of servers
- 00:13:44and they know sophisticated algorithms
- 00:13:47are machine
- 00:13:48learning techniques to identify when to
- 00:13:51purchase an order and when to sell it
- 00:13:54usually as a user's what we do is we see
- 00:13:58the information on the UI and take a
- 00:14:00decision but in this case HFD consumes
- 00:14:02all the price changes and how many
- 00:14:05stocks were sold and bought all of this
- 00:14:07information and based on that the
- 00:14:10algorithms decide when to place an order
- 00:14:13and when to sell those stocks for the
- 00:14:15profit usually companies make a lot of
- 00:14:18money like crores and crores of money
- 00:14:20using
- 00:14:21hft so that comes to the first entry
- 00:14:25point for all our API is API gateway the
- 00:14:29hf key might be placing a call here or
- 00:14:31users using the UI or mobile apps could
- 00:14:34be placing calls through HTTP REST API
- 00:14:39calls to API gateway so what is the
- 00:14:42functionality of API gateway so API
- 00:14:45gateway takes all the API calls from the
- 00:14:47clients and hft basically all of them
- 00:14:51are clients and then this API gateway
- 00:14:54routes them to appropriate services here
- 00:14:58I have only shown the accumulator and
- 00:15:01risk management system but internally
- 00:15:04they could be having a number of micro
- 00:15:06services they want to make call in this
- 00:15:10case just just for your understanding
- 00:15:13API gateway will mainly perform some of
- 00:15:15the actions like protocol transformation
- 00:15:18where in all of these API gates to a are
- 00:15:22where in all of these ApS are exposed
- 00:15:26using HTTP but internally demo might be
- 00:15:29using some other protocol it could be
- 00:15:30RPC it could be another rest call or it
- 00:15:35could be anything it doesn't matter in
- 00:15:38here usually all the API will be
- 00:15:41interacted using HTTP or HTTPS and
- 00:15:45inside it could be only HTTP so all the
- 00:15:49security layer will be dropped here
- 00:15:50there could be a firewall as well in the
- 00:15:56front that is a web firewall or raph you
- 00:16:00can call it as
- 00:16:01once the request is sent from API
- 00:16:05gateway it goes through accumulator what
- 00:16:10happens in accumulator is all the others
- 00:16:13are given a sequence number based on a
- 00:16:16lot of different criterias for for your
- 00:16:18understanding
- 00:16:19all you have to understand is when the
- 00:16:21order comes in it needs a unique ID the
- 00:16:24unique ID is basically as I need at this
- 00:16:26point of time and you could be thinking
- 00:16:29why in the world we are as an unique ID
- 00:16:33over here why not when we save in the
- 00:16:35databases because you will understand a
- 00:16:38lot of our functionality over here is
- 00:16:41done all in memory without using
- 00:16:43database so we need a unique
- 00:16:45identification number so that will be
- 00:16:48happening here so we will basically I
- 00:16:50didn't assign a unique ID for any order
- 00:16:54request it could be buy or a sell order
- 00:16:56they will assign a unique ID oh here and
- 00:16:59then that order information will be sent
- 00:17:02to risk management system or it is also
- 00:17:06called as RMS usually it is what it does
- 00:17:10is it makes sure's that order is safe to
- 00:17:15be entered into the whole system and
- 00:17:17there are so many checks which are
- 00:17:20mostly to check if the order is within
- 00:17:23the predefined parameters of price size
- 00:17:26capital etc there are I so these are the
- 00:17:31list of risk management usually any
- 00:17:35stock market will do for any given order
- 00:17:39I will be posting some of the links in
- 00:17:41the description you can go and read all
- 00:17:43about it
- 00:17:45so once the order passes through the
- 00:17:48risk management if it is safe to be
- 00:17:50processed that order will actually go to
- 00:17:53the router what happens in the router is
- 00:17:57based on the kind of stock you want to
- 00:18:01sell or buy it will be redirected to one
- 00:18:04of the cube so will be having n number
- 00:18:06of queues which is equivalent to the
- 00:18:09number of companies we will be handling
- 00:18:11so if suppose we are handling 1000
- 00:18:16companies in our system obviously will
- 00:18:20be having 1000 stocks to handle so there
- 00:18:23will be 1,000 Q's over here so all the
- 00:18:26requests say for example all the
- 00:18:28requests for Google stock will be
- 00:18:31redirected over here all the orders you
- 00:18:34know buy orders and sell orders for
- 00:18:37Tesla will be redirected over here so at
- 00:18:40any given point of time in any of the
- 00:18:42queue only similar kind of stocks will
- 00:18:45be there so this router could be
- 00:18:49anything you could implement using Kafka
- 00:18:52streams if you are using Kafka queue
- 00:18:55otherwise if you're on AWS maybe SNS can
- 00:18:59also act as a router based on your
- 00:19:04routing key these queues could be sqs or
- 00:19:07it could be Kafka as well once our order
- 00:19:12is placed over here inside here and this
- 00:19:15is picked up by our matching engine we
- 00:19:20have already discussed how we match it
- 00:19:22it is simple algorithm which runs
- 00:19:25completely in memory so the idea over
- 00:19:30here is so consider if you have three
- 00:19:35queues so there will be one virtual
- 00:19:37machine for our queue okay so this VM is
- 00:19:42responsible to handle Apple and this is
- 00:19:46responsible to handle Google and this is
- 00:19:48responsible to handle Tesla so this will
- 00:19:51be having one process which will be
- 00:19:54reading from the queue and then matching
- 00:19:59by using two arrays one is for by and
- 00:20:02one is for cell so as we discussed the
- 00:20:05algorithm it's the same algorithm or a
- 00:20:07matching engine which will be running
- 00:20:09inside this virtual machine which will
- 00:20:13be using heaps or linked lists or
- 00:20:15whatever array and keeps on consuming
- 00:20:18from the queue and then matches it or if
- 00:20:21you have very less orders per second
- 00:20:24what you could do is you can just have a
- 00:20:27couple of virtual machines and
- 00:20:29then say for example it is this one
- 00:20:38suppose if you're handling very less
- 00:20:41traffic or orders per second you can
- 00:20:43have a couple of virtual machines and
- 00:20:45you can run you know one or more
- 00:20:49instances of matching engine in in the
- 00:20:52same virtual machine and you could be
- 00:20:55consuming from multiple q q q so say for
- 00:20:58example Tesla Google and Apple you might
- 00:21:03be consuming all of that into same
- 00:21:06machine in different processes instant
- 00:21:10you know different processors each
- 00:21:12instance of matching engine will be
- 00:21:14running here and then they will be
- 00:21:16executing in the same machine but it all
- 00:21:20depends on how you want to design it's
- 00:21:21okay
- 00:21:23it based on the requirement and traffic
- 00:21:28so the idea here again is we've so any
- 00:21:33at any given part of time we need two
- 00:21:36matching engine instances per stock or
- 00:21:40two matching engine processors per stock
- 00:21:43and there should be deployed in
- 00:21:45different machines and different data
- 00:21:48centers in this case let's see for
- 00:21:51Google will be having active matching
- 00:21:54engine over here and also a passive
- 00:21:56matching engine over here why do we need
- 00:21:59that way because we need to have high
- 00:22:02availability and also reliability if for
- 00:22:05some reason if our active matching
- 00:22:08engine fails then immediately the
- 00:22:12passive matching engine will take over
- 00:22:14and starts processing the orders and
- 00:22:17then it does all the things which active
- 00:22:21matching engine without done we will be
- 00:22:24needing zookeeper here because we need
- 00:22:28someone to take care of who is the
- 00:22:31active engine and it has to do its all
- 00:22:33other responsibilities in in usual case
- 00:22:37what happens is if both of these engines
- 00:22:40like active engine passive engines are
- 00:22:42always a
- 00:22:43and running then both cannot be updating
- 00:22:47into the databases only the active
- 00:22:50engine should be updating all the data
- 00:22:52into the database and backups and
- 00:22:54everything so we need these systems to
- 00:22:58understand who is the active matching
- 00:23:01engine because both will be running they
- 00:23:03don't have an idea of am i the active
- 00:23:06matching engine so zookeeper will
- 00:23:08basically help you to identify who is
- 00:23:10the active matching engine zookeeper has
- 00:23:13a concept called sequential Z node and
- 00:23:17also FP ephemeral you know what this is
- 00:23:22a short living Z node it only have an
- 00:23:27entry or forgiving server when there is
- 00:23:29always an active connection from any
- 00:23:32machine in this case our epic ephemeral
- 00:23:340 will have two entries one for the
- 00:23:40primary active matching engine and one
- 00:23:43for the passive man you know matching
- 00:23:45engine in the sequential node whoever
- 00:23:48registers first they will have an entry
- 00:23:50and the second will be having so the
- 00:23:52first one is always the active instance
- 00:23:54if suppose the active machine get
- 00:23:57disconnected then this will go away so
- 00:24:00the second will become active in that
- 00:24:01case so this will not be there until
- 00:24:04unless it comes back again and then the
- 00:24:07ID will be here and that will be the
- 00:24:09passive and also one more interesting
- 00:24:13thing you need to know is both of these
- 00:24:18matching engines will be active active
- 00:24:20that means both will be consuming ok
- 00:24:24both will be consuming the messages
- 00:24:26which is sent from this router to this
- 00:24:29group all at the same time that means
- 00:24:31that everything is done over here in the
- 00:24:35same time it could be a little bit of
- 00:24:37difference in the time or it could be
- 00:24:39same but basically the idea over here is
- 00:24:41whatever this pass active matching
- 00:24:45engine is processing in the same time
- 00:24:47the passive matching engine is also
- 00:24:49processing the only difference is it
- 00:24:51will not be writing or sending any
- 00:24:54outputs to the underlying system but the
- 00:24:56active
- 00:24:57matching agent will be keep on sending
- 00:24:59why do we have to do that is if this
- 00:25:03goes down this passive matching agent
- 00:25:06should take over immediately so it is
- 00:25:07already in the same state of the active
- 00:25:10matching engine so this kind of design
- 00:25:12is needed because we are doing
- 00:25:14everything in memory so that's the
- 00:25:19reason why we need both of these active
- 00:25:22and passive matching engine to keep on
- 00:25:25performing the same things so what
- 00:25:30happens later there once the active
- 00:25:36matching engine figures out a perfect
- 00:25:38match for by order in the sales selling
- 00:25:42array or linked list it what it does is
- 00:25:45yeah not here it actually sends a
- 00:25:49message to another router saying that I
- 00:25:53actually found a matching order and then
- 00:25:56it is sent to a lot of different
- 00:25:57partitions so in case of Kafka the N
- 00:26:02number of threads are a number of nodes
- 00:26:05in the consumer side we will be having n
- 00:26:07number of partitions you can design here
- 00:26:10as well something like how we did it
- 00:26:12here like you can have individual queue
- 00:26:15for each stock or you can do it as a
- 00:26:18partition and only the server consumes
- 00:26:21every stock what what is the idea of the
- 00:26:25primary data server is it basically
- 00:26:28computes the stock tix star-struck tech
- 00:26:33looks like this it is simply the
- 00:26:37representation of minimum representation
- 00:26:41or measurement of the minimum upward or
- 00:26:43downward moment in the price in this
- 00:26:45stock exchange the server once the order
- 00:26:49is matched it receives the information
- 00:26:52about the order or the trade we just
- 00:26:54happen and then it is going to convert
- 00:26:57the price changes into the tip and that
- 00:27:01changes will be written into our DBMS
- 00:27:04and the idea in the database is that
- 00:27:08this our DBMS is sharded you can
- 00:27:11shard it into shared by stock so
- 00:27:15basically you'll have one shard
- 00:27:18per stock so this could be Apple shard
- 00:27:24this could be Google shard so you will
- 00:27:28be maintaining thousands of charts or if
- 00:27:31your data is not too much you can use
- 00:27:34hashed charting or any other strategy
- 00:27:37it's up to you so usually it's a good
- 00:27:39way to charge buy the stock itself so
- 00:27:42you have all of the specific stocks data
- 00:27:46in one shard and it will be easier to
- 00:27:49scale and you can have more number of
- 00:27:51writes and reads achieved in the same
- 00:27:53shard itself and also this primary data
- 00:27:56server is going to write into to the
- 00:27:58time series databases which will be used
- 00:28:01to render the graphs like time did the
- 00:28:06price of variation at any given point of
- 00:28:10time and and something like that
- 00:28:13so time series are really good in that
- 00:28:15so who is going to consume that data is
- 00:28:17the UI and the front-end application
- 00:28:21needs to show graphs like this one so it
- 00:28:25you know always need time series
- 00:28:27databases it is not really a good idea
- 00:28:29to use a DBMS for that pattern and also
- 00:28:33all of this information is also consumed
- 00:28:35by your stream processing engine where
- 00:28:37you can do a lot of other streaming
- 00:28:41processing like fraud detection other
- 00:28:44machine learning whatever you want to do
- 00:28:46basically will be having a you'll you
- 00:28:49will be having a stream processing over
- 00:28:50here so I guess we did a one cycle of
- 00:28:55how an order starts its journey from an
- 00:29:00API gateway into risk management router
- 00:29:02and it goes to Q from the queue it goes
- 00:29:04to active matching engine from matching
- 00:29:07engine it comes to router and here and
- 00:29:10then saves and everything so meanwhile
- 00:29:13this primary data server will be will
- 00:29:18can talk to this payment system and
- 00:29:21deduct the amount or money from
- 00:29:25the users wallet our account once the
- 00:29:28matching is done and that information is
- 00:29:31also updated over here okay so we need a
- 00:29:35payment payment system that itself is a
- 00:29:37big system I don't want to talk about it
- 00:29:40now there if the user want to add some
- 00:29:44money you can actually make this API
- 00:29:46call and redirect it to bank or whatever
- 00:29:50it is
- 00:29:51so he'll be loading all the money into
- 00:29:53the payment wallet so this system will
- 00:29:56basically take care of that and you can
- 00:29:59also see that the UI is coupled with
- 00:30:03Syrian or the aps can be this ApS can be
- 00:30:08coupled to the Syrian to cash some stuff
- 00:30:10like like the graphs I showed you here
- 00:30:13because this data is mostly once it is
- 00:30:16there it is always historical data so
- 00:30:20you don't need to really query the time
- 00:30:23series database all the time so you can
- 00:30:25cache that as well so the other
- 00:30:27important thing we need to discuss is
- 00:30:29how the reliability of this matching
- 00:30:33engine we already spoke about passive
- 00:30:36and active engine what happens in
- 00:30:38certain cases that both machine is down
- 00:30:40in that case how the things work is on a
- 00:30:44specific duration or interval like say
- 00:30:47every one minute once data in the active
- 00:30:52matching engine will be dumped into you
- 00:30:56know Redis or any storage engine the
- 00:31:00idea here is as I already mentioned in
- 00:31:04the algorithm your data structure
- 00:31:06whatever you use in your algorithm
- 00:31:07should be serializable
- 00:31:10okay so that way you will be dumping all
- 00:31:14the recent data into the snapshots so in
- 00:31:20other words in the worst case if both
- 00:31:22matching engine is down so when they
- 00:31:26come back they can read all the latest
- 00:31:28the last-known state of the machine from
- 00:31:32here and and and that way you will get
- 00:31:37back all the order
- 00:31:39our orders which were there already in
- 00:31:41the order book and one more thing you
- 00:31:43also need to understand is Kafka does
- 00:31:47support transactions and manual
- 00:31:53acknowledgments which you can actually
- 00:31:57manual acknowledgments and transactions
- 00:31:59which you can actually use it to build
- 00:32:03your matching algorithm as well that way
- 00:32:06you can make it make the system more
- 00:32:08robust also as and when you receive your
- 00:32:12order in the active engine or passive
- 00:32:15engine you can also write a copy of
- 00:32:18order into Cassandra this is not really
- 00:32:24the job of matching engine there could
- 00:32:26be one more thread which is running that
- 00:32:30could keep on writing into the Cassandra
- 00:32:32so that way after you get all the latest
- 00:32:35information from the snapshot you can
- 00:32:37query the remaining order information
- 00:32:41from the Cassandra and then load it back
- 00:32:43to your matching engine memory so you
- 00:32:46have the latest state available I mean
- 00:32:49here there is a lot of different
- 00:32:51strategy you can use it somehow you need
- 00:32:53to persist the latest known state into
- 00:32:56Redis and then get back that information
- 00:32:59so that's about it
- 00:33:02I guess I have covered most of the
- 00:33:04information you need there the other
- 00:33:07things to understand here is this
- 00:33:12primary data server is also signing the
- 00:33:15information to the pop servers so what
- 00:33:18pop servers does is they are basically
- 00:33:21providing the updates to the users or
- 00:33:26hft in real time like basically the
- 00:33:28broadcast using WebSockets are using
- 00:33:32tcp/ip or UDP ports the idea here is you
- 00:33:39should be able to send the price changes
- 00:33:42or ticks as soon as possible to the end
- 00:33:48users because H of T is high-frequency
- 00:33:51trading where they will
- 00:33:53deciding what stocks to purchase or sell
- 00:33:56in you know in terms of ten milliseconds
- 00:34:01precision so they need to know as soon
- 00:34:04as possible so they will be listening to
- 00:34:08these pop servers there are other
- 00:34:10strategies as well in in real exchanges
- 00:34:13what they do is they specifically as an
- 00:34:15only specific number of connections to
- 00:34:18each servers because they don't want to
- 00:34:21overload and also they try to instead of
- 00:34:23looping through each and every
- 00:34:24connection instead of looping through
- 00:34:27each and every connection and
- 00:34:28broadcasting is they try to broadcast it
- 00:34:31instead of just looping through because
- 00:34:33when they loop through there were very
- 00:34:35first connection might get the the the
- 00:34:41tick information of the prize
- 00:34:42information and then the last connection
- 00:34:44might get a little slowly so they don't
- 00:34:47want to lose out that as well so they
- 00:34:49want to receive as fast as possible
- 00:34:51there are a lot of strategies over here
- 00:34:52as well that is out of the subject right
- 00:34:55now
- Softwarearchitektur
- Börsensystem
- Kryptowährung
- Forex-Handel
- Matching-Engine
- Echtzeitdaten
- Risiko-Management
- Skalierbarkeit
- Hochfrequenzhandel
- Systemzuverlässigkeit