TCP - A story about hope, (pkg) loss && the missing link

Ola Gasidlo

Recorded at JSConf EU 2018


Let's have a huge round of applause for Ola. We just have twenty-five... come on, it's like the very last talk of the side track for 2018. Did you enjoy the conference? Who did? Who lost their voice? Nice, if you need medicine...

So my name is Ola. I'm an engineer [unclear], you might have heard. I do mostly performance, compatibility, specs, CSS, JavaScript: just the cool stuff. Today we'll try to understand how TCP works, surprise: how it's actually delivering those packages everyone keeps talking about, and maybe we'll even learn what happens to the ones that are lost. Okay, let's jump right in.
Let's take a look at the Internet Protocol family. You might recognize one or two of them. It has just very few members, like 500, so it's fine. The ones you see here are probably the most common. Do you recognize any? Like HTTP, FTP, SMTP and, when you're as old as I am, Telnet. They're cool. And there are also different models that group them together to make the roles kind of more understandable. This, for example, is the OSI model, which groups the protocols in seven layers. It sorts the protocols into different layers by what they do, like the network layer or the application layer. Totally makes sense, right? There's also, well, the most important one, the eighth layer everyone forgets, but it's also the hardest: people. So there's also the DoD layer model, right?
It stands for Department of Defense, because they developed it. It has just four layers, but it's kind of the most common in our daily architecture. The DoD model groups the protocols in a little bit different way. You can see the relations in the graphic; I added the relations to the OSI model. The DoD layer model is also called the TCP/IP model, because both are kind of the foundational protocols in the suite. So what do they even do? I don't know.
So what are the roles? IP is short for Internet Protocol. It's here to give the sender and receiver kind of numbers, so the packages know where they're coming from and where to go, like this one. We could also go with names, but computers are not that good with words, so we stick with numbers. Another thing it does is check for the closest router for the data, so it hands a package over like "you, catch" and it will be fine.

TCP is short for Transmission Control Protocol. As the physical network, surprise, is kind of not reliable, this protocol is taking care of the transmission of the data. So it creates a reliable network for you and makes sure everything gets from one point to another. Especially with something so wibbly-wobbly as the physical network, it's a good thing to have. At least the sharks stopped chewing on the undersea fiber cables; that's a good thing.

So what does TCP do and not do? It makes sure your data arrives in a specific order, makes sure data has minimal errors, so it's actually correct, makes sure duplicated data is discarded, makes sure lost packages are resent, and protects the network from being overloaded. We'll get to that later. What it does not do: it just doesn't care who's sending and receiving. It's just this cute delivery bear.
So do you understand the roles? Yes? I know it's late, wake up. Yes? That's really good. So we get to the next point: packages. I'm going to tell a story; it makes things easier. Okay, so imagine your friend is sick. It happens, and it's a good friend. They love to read, so you want to lend them a book. This is one of my favorite books. And you can't go there, because reasons, and your roommate is so nice that they'll just grab this book and bring it there. So this is what they do.
The roommate takes the book and bikes over. Your friend is super happy. They come back. Congratulations, you just delivered a package.

Okay, but your friend is really sick, like you don't want to go there. So your roommate is really cool, and you decide to put together a small package to make them feel better. So you grab some tea and candy and cookies and sort them into three boxes, like those you bought at IKEA like 10 years ago, in the back of your cupboards, right? Just take them: small, same size. Put numbers on them, because your friend is really sick and might forget that they got three, and which ones. That's fine. So your roommate grabs them and bikes over. Your friend is like, "That's so nice of you, thank you," and sends a small note for you. The roommate comes back. Sounds reasonable, right? Congratulations, you just understood how a round trip works. Good job.

So those packages are literally nothing else than boxes that contain candy, tea, cookies, or data, whatever. Their purpose is basically just to group those bits and bytes, and every box has a number, so you know the order and especially whether all of them arrived. That's cool.
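The numbered boxes can be sketched in a few lines of Python. This is only an illustration of the story, not how a real TCP stack is written: the function names `split_into_segments` and `reassemble` are made up here, and real TCP numbers individual bytes rather than whole boxes.

```python
def split_into_segments(data: bytes, segment_size: int) -> list:
    """Cut data into same-size, numbered boxes (the last one may be smaller)."""
    return [
        (number, data[start:start + segment_size])
        for number, start in enumerate(range(0, len(data), segment_size))
    ]


def reassemble(segments: list, expected_count: int) -> bytes:
    """Put the boxes back in order and complain if any box is missing."""
    by_number = dict(segments)
    missing = [n for n in range(expected_count) if n not in by_number]
    if missing:
        raise ValueError(f"missing segments: {missing}")
    return b"".join(by_number[n] for n in range(expected_count))


boxes = split_into_segments(b"tea, candy, cookies", 8)
shuffled = [boxes[2], boxes[0], boxes[1]]  # boxes arrive out of order
print(reassemble(shuffled, expected_count=len(boxes)))
```

Because every box carries its number, the receiver can both restore the order and notice when one never showed up.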
Okay, we got that now. So now the most important and biggest part: we need to try to understand how the transfer actually works. And explaining it with data is kind of boring, right? Who likes pancakes? Pancakes! They can be split up into small units, you can slap stuff on them, how much you want, whatever you want. That's amazing. They can stay warm and fresh in those boxes. So let's stick with pancakes.

Every transfer starts with the so-called three-way handshake. The three-way handshake is here to make sure both sides are the correct ones and aware that the file transfer actually happens. This is what a three-way handshake looks like; you might remember it from school.
We have sender, receiver, things happen, people get scared. It's correct, but I don't like it that much. So I was at the playground the other day, reading this really good book, I do read a lot, and my six-year-old was like, "What are you reading?" "TCP, the three-way handshake." "Explain it to me." So I did. She was like, "Mom, this is easy, I'm gonna [unclear] now." I want to make you feel the same way.
We'll get there. So let's translate this into human. Let's meet our sender and receiver. This is Stefan. He's an engineer who loves baking and cooking [unclear]. This is Dominic, also an engineer, who likes to travel and, because they travel so much, home-cooked meals. Both of them have other things in common, like speaking at conferences, wearing onesies, I'm sure, and especially [unclear], right?

Okay, so imagine Stefan has made some pancakes and Dominic wants some. But they both really want to be sure that the transfer works, because a hangry Dominic is nothing that anyone wants, trust me, he's my neighbor, and that all pancakes actually arrive at the correct destination, right? So this is why we perform this three-way handshake.

Stefan grabs some paper and picks a random number: 23 it is. He writes it on a piece of paper. This is what we call a SYN; a SYN is a request to synchronize. Okay. Dominic takes a look at the note and performs an ACK, which stands for acknowledgement, and therefore he adds one to the number. That lets Stefan know that he received the note. He thinks about a number too, but, pancakes, it's hard. He gets there: he picks seven. Great. So he sends that out. The note goes back to Stefan, and he performs the ACK: it's plus one to Dominic's number, and we're good to send those pancakes. Okay.
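The three notes can be written down as a small Python sketch, using the numbers from the story (Stefan picks 23, Dominic picks 7). The `Note` class and `three_way_handshake` function are names invented for this illustration; a real handshake happens inside the operating system's TCP stack.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Note:
    flags: frozenset
    seq: int
    ack: Optional[int] = None


def three_way_handshake(sender_number: int, receiver_number: int) -> list:
    """Return the three notes exchanged: SYN, SYN+ACK, ACK."""
    # 1. Sender asks to synchronize and shares its random number.
    syn = Note(frozenset({"SYN"}), seq=sender_number)
    # 2. Receiver acknowledges (sender's number + 1) and shares its own number.
    syn_ack = Note(frozenset({"SYN", "ACK"}), seq=receiver_number, ack=syn.seq + 1)
    # 3. Sender acknowledges the receiver's number + 1.
    ack = Note(frozenset({"ACK"}), seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for note in three_way_handshake(sender_number=23, receiver_number=7):
    print(sorted(note.flags), "seq =", note.seq, "ack =", note.ack)
```

With 23 and 7, Dominic's note acknowledges 24 and Stefan's final note acknowledges 8.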
So these little notes are not just love notes; they are the TCP headers. They contain all the information that TCP packets need, and this is how they look in real life. Yeah. Let's take a look at the TCP header at the moment Dominic sends it back to Stefan, right?

So Dominic is the sender, therefore the source port is Dominic and the destination port is Stefan. Dominic's random number was seven, you might remember, so the sequence number is also seven. If data had been transferred, the length of the data in bytes would have been added to the sequence number, but we didn't transfer any, this is just a handshake, so it's a seven. Stefan's number was 23 and we got an ACK, plus 1, so that's the acknowledgement number. The next is the data offset, which pins down exactly where the TCP header ends and the payload begins. Reserved means three bits kept for new, cool flags. You saw some of the flags, which are next to it, already, like SYN, ACK, FIN. Window contains the size of data the sender or receiver can handle, so it's the size of a pancake batch. Depending on the direction, it's either the one from the receiver or the one from the sender, because they switch, right? We'll get back to that later.

Checksum: minimal possible security effort, it's all right. Urgent pointer: the possibility to say, hey, that packet is more important than the others. Options, fast forward, if you like details: [unclear] pointers and other things. Options are good, we like to have options. Filler, or padding, is also a great performance improvement: we assume 32 bits, and if the header does not fill this up, padding will. The cool thing is we don't have to calculate bit by bit by bit, which is just annoying; we go in steps, and that helps. They're not pancakes, right? That's obvious.

So okay, the blue ones you see here we talked about already. The yellow ones we'll take a closer look at: pancakes and the size of the batch. And the green ones are a little bit more advanced, very interesting, but we're not going to do this today.
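To make the header fields a bit more concrete, here is a hedged Python sketch that packs and unpacks the fixed 20-byte part of a TCP header with the standard `struct` module. The port numbers (7000 and 2300) and the window value are made up for illustration, the checksum is left at zero rather than computed, and options and padding are left out entirely.

```python
import struct

# The six classic flag bits of the flags byte.
FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}

# Fixed header without options, network byte order:
# ports (2+2), sequence (4), acknowledgement (4), data offset/reserved (1),
# flags (1), window (2), checksum (2), urgent pointer (2) = 20 bytes.
HEADER = struct.Struct("!HHIIBBHHH")


def build_header(src_port, dst_port, seq, ack, flags, window):
    flag_bits = 0
    for name in flags:
        flag_bits |= FLAGS[name]
    data_offset_words = HEADER.size // 4  # 5 words of 32 bits, no options
    # Checksum and urgent pointer are left at 0 in this sketch.
    return HEADER.pack(src_port, dst_port, seq, ack,
                       data_offset_words << 4, flag_bits, window, 0, 0)


def parse_header(raw):
    (src_port, dst_port, seq, ack,
     offset_byte, flag_bits, window, _checksum, _urgent) = HEADER.unpack(raw[:HEADER.size])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "header_bytes": (offset_byte >> 4) * 4,  # data offset counts 32-bit words
        "flags": sorted(name for name, bit in FLAGS.items() if flag_bits & bit),
        "window": window,
    }


# Dominic's reply from the story: his number 7, acknowledging Stefan's 23 + 1.
raw = build_header(src_port=7000, dst_port=2300, seq=7, ack=24,
                   flags=["SYN", "ACK"], window=10)
print(len(raw), parse_header(raw))
```

The data offset is stored in 32-bit words, which is why the header has to be padded to a multiple of 32 bits when options are present.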
You might have noticed Stefan and Dominic: hi! So as I mentioned, Dominic is hungry. How would you feel if we actually fed him pancakes, here and now? I mean, I'm a parent, I always have a fresh batch of pancakes in my bag. But okay, before we start, we really want to make sure he does not overeat and get sick. This is why we need to check how many pancakes he can actually eat without getting sick. In TCP this mechanism is called flow control. It's here to prevent the sender from overwhelming the receiver with pancakes. Possible reasons are: Dominic is busy right now with getting some cutlery, right? Or he is receiving some berries, maple syrup and maybe even bacon from someone, I don't know, in another transfer. Or there is just a fixed amount of buffer, which translates into human as: a plate can hold just that many pancakes, right?

So here you can see a very simplified version of the round trip. Usually it looks like this, but we don't have enough space, so let's stick with this. Cool. Dominic receives pancakes but doesn't send anything right now, except a love letter, right? Cool.
And now we need to store a value for how many pancakes Stefan can put on Dominic's plate per package. This way we make sure that there are no leftovers and he does not overeat, but he still gets as much food as possible. So with every ACK that all pancakes have arrived, he writes this little number on a note. The number is called receive window size, or rwnd in short, which is the available buffer space: the plate. Dominic lets us know every round trip how many pancakes he can receive and how much space he's got left. Same goes for Stefan if he would receive some, I don't know, bacon maybe. But I'm sorry buddy, I don't think you will today.

The maximum possible value is 1 gigabyte. It used to be 64 KB, which was a little bit small, and it's still very likely to differ because of configurations, network, round-trip time. And as this is hard to remember, just imagine like one gigabyte.
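Where the 64 KB and the roughly 1 gigabyte come from can be checked with a little arithmetic. The window field in the header is 16 bits wide, and the window scale option allows shifting it left by at most 14 bits. The helper name `effective_window` is made up for this sketch.

```python
MAX_WINDOW_FIELD = 0xFFFF  # 16-bit window field: at most 65,535 bytes
MAX_WINDOW_SHIFT = 14      # largest shift the window scale option allows


def effective_window(window_field: int, shift: int = 0) -> int:
    """Receive window in bytes once the scale factor is applied."""
    if not 0 <= window_field <= MAX_WINDOW_FIELD:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_WINDOW_SHIFT:
        raise ValueError("window shift must be between 0 and 14")
    return window_field << shift


print(effective_window(0xFFFF))      # without scaling: about 64 KB
print(effective_window(0xFFFF, 14))  # with maximum scaling: just under 1 GiB
```

The second value is 1,073,725,440 bytes, which is where the "one gigabyte" rule of thumb comes from.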
Wait, what about the sender and the network? That's important, right? Come on, two more minutes. We don't want to overwhelm them. So this is why we have the congestion window size, yes: cwnd. It is not sent, just known: only the one who sends knows it. This value defines how many segments can be sent out per round trip; we'll get to that later. So a segment is kind of a batch of pancakes, a so-called TCP protocol data unit, defined by the connection you're actually using. So a batch of pancakes is smaller when you use a modem and bigger, with bigger pancakes, when you use a fiber cable connection. What gets sent under the congestion window can never be bigger than the receive window size, because a plate can only hold so many pancakes. So every round trip the congestion window size keeps increasing, up to the maximum receive window size, until the network reaches its limit or you don't have anything left to send. Does it make sense? You're awake? Awesome. It's good that you're awake, because now we can start the transfer.
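Put as a rule of thumb, the sender may have in flight whatever is smaller: its own congestion window or the receiver's plate. A minimal Python sketch, assuming the congestion window is counted in segments and the receive window in bytes; the function name and the segment size of 1460 bytes are illustrative choices, not values from the talk.

```python
def segments_allowed(cwnd_segments: int, rwnd_bytes: int, segment_bytes: int) -> int:
    """How many segments the sender may send in one round trip."""
    rwnd_segments = rwnd_bytes // segment_bytes  # how many batches fit on the plate
    return min(cwnd_segments, rwnd_segments)


# Early in the connection the congestion window is the limit...
print(segments_allowed(cwnd_segments=10, rwnd_bytes=65535, segment_bytes=1460))
# ...later the receiver's plate is.
print(segments_allowed(cwnd_segments=100, rwnd_bytes=65535, segment_bytes=1460))
```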
Look, everyone who sits in between them: welcome, you're the data pipeline. So how it works: you get the package, you do your stuff as fast as possible. It's all about performance, right? We need a small round-trip time. Okay, first pancakes. Let's go, let's go! Keep them in the package, that's fine, that's fine. Come on, we don't have much time. Don't close it. Go go go go go, faster, faster!
[Music]
You got the note? Come back! Don't even close it, it's all about the performance. Oh come on, this takes forever. And the next batch, the next batch! So now, by the way, we have four segments; we increased, because he's hungry, he's eating. Go go go, we got that down. That's faster. Look, that's faster. And come on, the last round trip, the last one, the last one, the last one. Wait, did someone steal pancakes?
[Music]
Thank you so much. A big applause for everyone. Thank you so much, good job.
[Music]
So you might have realized two things. Except of, I hope... are you okay now? Okay, yeah, cool. Okay, so there were two things you might have recognized. One was that we actually increased the segments every round trip; you totally saw that. The other was package loss. Because what is this, random? The increase of my pancakes, the losses, random? Surprise: no.

This mechanism is called slow start [unclear]. What it also does is estimate the available capacity between the sender and the receiver, by exchanging data. So you remember I was talking about segments, right? Batches of pancakes. The reason why we have to find the congestion window size is to make sure we do not overwhelm the network or the sender. They never start with the full bandwidth, they never do. They start really slow and small and check out how stable the network is and how much it can handle. So, simplified, slow start looks like this: you have time and stacks of pancakes. The default congestion window size today is 10; it used to be two. That's just amazing. Let's go with that.
So the value you can see here is two, right? When everything was received, you get an ACK, and we measure the round-trip time. Then the algorithm is incremental, so it doubles up, and then we can take four and do the same. When everything arrives, it's again an ACK, round-trip time measured, then 8, then 16. This is what the algorithm I just explained to you looks like.
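The doubling can be written down as a tiny simulation. This is a simplified sketch of the slide, not a real TCP implementation: it assumes every segment of a round trip is acknowledged, counts the window in whole segments, and caps growth at a receive window given in segments. The function name is made up here.

```python
def slow_start(initial_cwnd: int, rwnd: int, round_trips: int) -> list:
    """Congestion window (in segments) at the start of each round trip."""
    cwnd = initial_cwnd
    history = []
    for _ in range(round_trips):
        history.append(cwnd)
        # Everything was acknowledged, so double, but never beyond the plate.
        cwnd = min(cwnd * 2, rwnd)
    return history


print(slow_start(initial_cwnd=2, rwnd=64, round_trips=4))  # [2, 4, 8, 16]
```

Starting from today's default of 10 instead of 2 simply skips the first few round trips of this ramp-up.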
Let's assume we just lost pancakes, right? Okay: TCP notices the disappearance and resends those packages, because TCP is awesome, it loves you, it takes care of you, right? But it will probably fail again, because the network apparently can't handle that many segments. So package loss, surprise, is actually not a bad thing. It's a built-in mechanism that tells us that we actually overwhelmed the network. To avoid losing more packages in the network, we have something that is called congestion avoidance. It's a feedback mechanism: package loss shows us that somewhere along the way we encountered a congested link or router. This is why we need to adjust the cwnd, right? Okay.
So what we do is check which was the last value that worked, reset our cwnd back to that, and basically grow from there. We will run into package loss at some point again, and then we'll just apply the algorithm again and adjust. So the latest value that worked was eight, right? Cool. So we take that, and we don't double it up; we just take half of it and put it on top of what we had before: eight plus four, and then twelve plus six, and it goes on until we lose packages again.
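The numbers from this step (8, then 12, then 18) can be reproduced with a short sketch. It follows the simplified rule as told here, adding half of the current window per round trip; it is not a faithful model of a real congestion avoidance algorithm, which typically grows more cautiously (classically by about one segment per round trip). The function name is made up for illustration.

```python
def grow_after_loss(last_good_cwnd: int, round_trips: int) -> list:
    """Window sizes after falling back to the last value that worked."""
    cwnd = last_good_cwnd
    history = [cwnd]
    for _ in range(round_trips):
        cwnd += cwnd // 2  # add half instead of doubling
        history.append(cwnd)
    return history


print(grow_after_loss(last_good_cwnd=8, round_trips=2))  # [8, 12, 18]
```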
The algorithm is the really tricky part here. When it's too aggressive in cutting down the increase, your connection will be bad. If you don't adjust quickly enough, then it will introduce more package loss. So PRR (Proportional Rate Reduction) is really cool, because those algorithms are repeatedly and constantly worked on, and this is one of the good algorithms that might cut off like three to ten percent of your round-trip time, and it's really nice. So all of that sounds amazing. Do you love TCP?
There are limitations. While we transfer data, every time we hit the limit we wait for the ACK, and that's one round-trip time of delay, right? Either the sender or the receiver is forced to stop and always wait for the ACK. And if that takes a very long time, it can cause massive gaps in the data flow. The effect is called the bandwidth-delay product. Round-trip time here is our bottleneck, and when the connection is unreliable, the times can vary a lot. So the good idea here is to optimize your window size values so that the round-trip time is enough to get the ACK back. The bigger the window size is without packet loss, and the smaller the round-trip time, the better. You can also use window scaling, because it helps you with that.
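A quick calculation shows why the window size matters so much. The bandwidth-delay product is the amount of data that has to be in flight to keep a link busy, and a window smaller than that caps the throughput at one window per round trip. The link speed and round-trip time below are example values, and the function names are made up for this sketch; integer arithmetic is used to keep the numbers exact.

```python
def bandwidth_delay_product(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bytes that must be in flight to fill the link for one round trip."""
    return bandwidth_bits_per_s * rtt_ms // 8000  # bits -> bytes, ms -> s


def max_throughput(window_bytes: int, rtt_ms: int) -> int:
    """Upper bound in bits per second when one window is sent per round trip."""
    return window_bytes * 8 * 1000 // rtt_ms


# Example: a 10 Mbit/s link with a 100 ms round-trip time.
print(bandwidth_delay_product(10_000_000, 100))  # 125000 bytes needed in flight
print(max_throughput(65535, 100))                # 5242800 bit/s with a 64 KB window
```

In this example a 64 KB window fills only about half of the link, which is the gap that window scaling is meant to close.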
Since TCP is an abstraction of a reliable network over an unreliable connection, it includes really cool features we talked about, like basic package error checking, in-order delivery, retransmission of lost packages, flow control, congestion control and avoidance. That's really cool, and those features are built in for the greatest efficiency possible, but they also kind of limit what you can do. Of course, when you need something special, you may need something else.

Head-of-line blocking is the last limitation I'd like to point out. As we have reliable and in-order delivery built in, we rely on all segments to arrive. So when three segments are sent out and just the first one of those is lost, it will drop all packages. So the first one was lost, the receiver is waiting for the first one, and for everything after it's like, "I want the first one, I don't care about the others." So Stefan will basically resend the packages until the first one arrives, then the second, and this might cause the connection to die at some point, because we will be overwhelmed with resending. There's no way around this, so consider something else if you don't need a reliable and ordered network. And please don't build stuff yourself, because you hear things like "UDP is great, I just need very few features," and you will realize after like six months: oops, I rebuilt TCP.

Okay, that was it. I had an amazing ending for you, but as you might have guessed, a package actually got lost. Thank you.
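The head-of-line blocking described in the closing part can be sketched as a receiver that only hands segments to the application in order. One assumption differs from the story: this sketch keeps later segments in a buffer instead of dropping them. Either way, nothing is delivered until the missing first segment arrives. The function name is made up for illustration.

```python
def deliver_in_order(arrivals: list, first_segment: int = 1) -> list:
    """For each arriving segment number, list what reaches the application."""
    next_expected = first_segment
    waiting = set()
    handed_over = []
    for number in arrivals:
        if number >= next_expected:  # ignore duplicates of delivered segments
            waiting.add(number)
        ready = []
        while next_expected in waiting:
            waiting.remove(next_expected)
            ready.append(next_expected)
            next_expected += 1
        handed_over.append(ready)
    return handed_over


# Segment 1 is lost at first and only arrives after 2 and 3.
print(deliver_in_order([2, 3, 1]))  # [[], [], [1, 2, 3]]
```

Segments 2 and 3 arrived fine, but the application sees nothing until segment 1 shows up.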