00:00 the world's smallest PNG
00:02 file is 67 bytes it's a single black
00:05 pixel here's what it looks like zoomed
00:07 in 200x it's probably more or less zoom
00:09 depending on your screen's aspect ratio
00:11 resolution a bunch of other things but
00:12 according to my browser right now this
00:14 time this is a 200x of the single black
00:17 pixel what makes this the smallest
00:18 possible PNG why is 67 bytes the limit
00:21 I'm a big nerd about codecs and AV stuff
00:23 but as soon as I saw this article I knew
00:25 I had to read it I was just so curious
00:27 so let's dive in on what makes this PNG
00:29 67 bytes and hopefully as we look
00:32 through this learn a bit about how PNGs
00:35 work wow what a beauty this file has
00:38 four sections one the PNG signature the
00:40 same for every PNG 8 bytes two the
00:43 image's metadata which includes its
00:45 Dimensions just 25 bytes three the
00:47 image's pixel data which is 22 bytes and
00:49 then four an end of image marker which is 12
00:53 bytes the rest of this post describes
00:55 this file in more detail and tries to
00:57 explain how PNGs work along the way there's a
00:59 big twist at the end if that excites you
01:01 but I hope you're just excited to learn
01:02 about pngs I know I am so let's dive in
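As a quick sanity check on the four sections just listed, the byte counts can be added up. This is only a sketch: the labels are made up for illustration, and the 12-byte figure for the end marker is an assumption based on what an empty chunk costs (4 bytes of length, 4 of type, 0 of data, 4 of checksum).

```python
# Byte counts for the four sections of the tiny PNG.
# Labels are illustrative; the 12 for the end marker is assumed from the
# fixed per-chunk overhead (4 length + 4 type + 0 data + 4 checksum).
sections = {
    "signature": 8,
    "image metadata": 25,
    "pixel data": 22,
    "end of image marker": 12,
}

total = sum(sections.values())
print(total)
```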
01:05 part one the PNG signature every single
01:08 PNG including this one starts with the
01:10 same eight bytes encoded in hex those
01:12 bytes are these I actually didn't know
01:14 that they were always the exact same
01:16 eight bytes that kind of makes sense but
01:19 surprising I wonder how many times these
01:21 eight bytes exist on the internet how
01:23 much repetitive unnecessary so to speak
01:25 storage has been used for this I'm sure
01:27 there's plenty more in JavaScript but
01:29 it's still interesting to see this is
01:30 called the PNG signature try doing a hex
01:33 dump on any PNG and you'll see it starts
01:35 with these bytes PNG decoders use the
01:37 signature to ensure that they're reading
01:39 a PNG image typically they reject the
01:41 file if it doesn't start with the signature
01:43 data can get corrupted in various ways
01:44 ever had a file with the wrong
01:45 extension and this helps address that
01:47 fun fact if you decode these bytes as ASCII
01:49 you'll see the letters PNG in there
01:52 interesting did not know that would do
01:54 that fascinating so that's the first
01:56 eight bytes one part done now we need to
01:58 go to the image metadata chunk
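The on-screen hex is not captured in this transcript, so here is a minimal sketch of the signature check a decoder might perform. The eight bytes are the standard PNG signature (89 50 4E 47 0D 0A 1A 0A); the names `PNG_SIGNATURE` and `looks_like_png` are hypothetical, chosen only for this example.

```python
# The fixed 8-byte PNG signature, written out in hex.
PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")


def looks_like_png(data: bytes) -> bool:
    """Return True if the data starts with the PNG signature."""
    return data[:8] == PNG_SIGNATURE


# Bytes 1 through 3 of the signature are the ASCII letters mentioned above.
print(PNG_SIGNATURE[1:4].decode("ascii"))
print(looks_like_png(b"GIF89a not a png"))
```

A real decoder would go on to parse chunks after this check; this sketch only covers the first eight bytes.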
02:00 the next part of the PNG is the image
02:02 metadata which is one of several chunks
02:04 what's a chunk oh boy we've never talked
02:06 about chunking for files on this channel
02:08 now have we it's a really good
02:09 opportunity to do such quick intro to
02:11 chunks other than the PNG signature at
02:13 the start PNGs are made up of chunks
02:15 chunks have two logical pieces a type
02:17 and some data bytes types are things
02:19 like image header or text metadata the
02:21 data depends on the type the text
02:22 metadata chunk is encoded differently
02:24 from the image header chunk these
02:26 logical pieces are encoded with four
02:28 Fields these fields are always in the
02:30 same order for every chunk they are
02:32 length which is the number of bytes in the
02:33 chunk data field the chunk type which is
02:36 the type of Chunk that you're currently
02:37 reading the data which is the actual
02:39 content of it and then a checksum to
02:41 make sure the data wasn't corrupted this
02:42 is only four bytes long as you can see
02:45 each chunk is a minimum of 12 bytes four
02:47 for the length four for the type and
02:48 four for the checksum note that the
02:50 length field is the size of the data
02:52 field not the entire chunk if you want
02:54 to know the whole size of the chunk just
02:56 add 12 four bytes for length four for
02:58 the type and four for the checksum you
02:59 have some wiggle room but chunks have a
03:01 specific order for example the image
03:03 metadata chunk has to appear before the
03:05 pixel data chunk once you reach the
03:07 image is done chunk the PNG is done an
03:10 interesting thing about these chunks if
03:12 you've ever seen the old Progressive
03:14 loading of the web prior specifically
03:16 with jpegs those chunks could come in
03:18 piece by piece or row by row which is
03:20 why we've all seen old websites where
03:22 individual rows of pictures came in one
03:25 at a time and we had the slow like
03:26 scrolling load this idea of a baseline
03:29 jpeg versus Progressive where the
03:30 Baseline jpeg is loading the chunks in
03:33 order so you see more and more content
03:35 where a progressive jpeg has a blurry
03:38 minimal version of the image first and
03:40 then you get sent more data over time
03:42 PNG has no concept of a progressive JPEG
03:44 and if I recall actually requires you to
03:46 have most of the data before it can even
03:48 render the image but by putting the
03:49 metadata so early they now know the
03:51 dimensions and they can fill additional
03:53 chunks in over time this allows you to
03:55 do that type of progressive loading but
03:57 it doesn't allow for the enhancement of
03:59 the way that Progressive jpeg
04:01 does and again once you reach the image
04:03 is done chunk the PNG is done our tiny
04:05 PNG will have just three of these
04:08 chunks the image header chunk this is
04:10 the first chunk of every PNG including
04:10 ours is of type IHDR short for image
04:15 header each chunk starts with the length
04:17 of the data in that chunk the IHDR chunk
04:19 always has 13 bytes of associated data
04:22 as we'll see in a moment 13 is 0D in
04:25 hex which gets encoded like this 00 00 00
04:28 0D chunk type is next this is another 4
04:31 bytes IHDR is encoded as 49 48 44 52
04:36 this is just ASCII encoding chunk types
04:38 are made up of ASCII letters the
04:39 capitalization of each letter is
04:41 significant for example the first letter
04:42 is capitalized which means it's a
04:44 required chunk next the chunk data IHDR
04:48 data happens to be 13 total bytes
04:50 arranged as follows the first eight
04:52 bytes encode the image's width and height
04:54 because this is a 1 by one image that's
04:56 encoded like so 00 00 00 01 00 00 00 01 if you
05:00 know hex you know how to increase these
05:01 accordingly for the number of pixels if
05:03 this was 10 and this was 10 that would
05:06 be the hex equivalent of 16
05:09 yeah that would be 16 so if you had a 16 x
05:11 16 image it would be sorry one zero one zero
05:14 hopefully that makes sense anyways next
05:17 two bytes are the bit depth and color
05:19 type these values are probably the most
05:21 confusing part of this PNG there are
05:22 five possible color types our image is
05:24 black and white so we use the grayscale
05:26 color type which is 00 for images that had
05:28 color we might use the true color type
05:30 which is 02 there are three other color
05:32 types which you don't need today but you
05:33 can read more about them in the PNG spec
05:35 this is interesting the header actually
05:37 has color signatures for PNG I didn't
05:39 actually know about this I knew a bit
05:41 about true color but I didn't know these
05:42 were just hard-coded number values in
05:45 the header once you've picked a color
05:46 type you need to pick a bit depth the
05:48 bit depth depends on the color type but
05:49 usually means the number of bits per
05:51 color channel in an image for example
05:53 hex colors like FE9802 have a bit depth
05:55 of eight eight bits for red eight for
05:57 green eight for blue our all-black image
06:00 doesn't need all that we only need one
06:01 bit the pixel's either completely black zero
06:03 or completely white one in our case it's
06:05 completely black if we picked a more
06:07 expressive color type and bit depth we
06:09 could make the same one by one image
06:10 visually but the file could be bigger
06:12 because there could be more bits per
06:13 pixel that we don't actually need for
06:15 example if we used the true color type
06:17 and 16 bits per Channel each pixel would
06:19 require 48 bits instead of just one not
06:22 necessary to encode completely black
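The bits-per-pixel arithmetic above can be sketched as samples per pixel times bit depth. The channel counts below come from the PNG color types (0 grayscale, 2 truecolor, 3 indexed, 4 grayscale with alpha, 6 truecolor with alpha); the names `CHANNELS` and `bits_per_pixel` are hypothetical, and this sketch does not check which bit depths each color type actually allows.

```python
# Samples (channels) per pixel for each PNG color type.
CHANNELS = {
    0: 1,  # grayscale
    2: 3,  # truecolor (red, green, blue)
    3: 1,  # indexed color (one palette index per pixel)
    4: 2,  # grayscale + alpha
    6: 4,  # truecolor + alpha
}


def bits_per_pixel(color_type: int, bit_depth: int) -> int:
    """Bits needed for one pixel: samples per pixel times bits per sample."""
    return CHANNELS[color_type] * bit_depth


print(bits_per_pixel(0, 1))   # grayscale at 1 bit per sample
print(bits_per_pixel(2, 16))  # truecolor at 16 bits per channel
```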
06:24 interesting so this detail of gray scale
06:27 versus true color is valuable when
06:29 you're trying to make the smallest
06:30 possible PNG because when you are
06:32 grayscale you don't have as much bit
06:34 depth for each pixel I did see somebody
06:36 dropped all of the different formats
06:38 here it did not that's fine that's
06:40 Twitch chat so palette based palette based
06:42 with transparency grayscale grayscale
06:45 with transparency grayscale with alpha
06:46 channel RGB RGB with transparency
06:48 RGB with alpha channel if you're not
06:50 familiar with palette based colors it's
06:51 actually really interesting I'm going to
06:53 quickly command F for palette which
06:55 does not appear in here so I'll go on a
06:56 quick tangent for color palettes the idea
06:59 of a color palette is really interesting
07:01 you might notice that a lot of these
07:04 GIFs as they come in and out have way
07:06 more detail than the base version which
07:09 tends to have a very small number of
07:10 colors the reason for that is in the old
07:12 days we couldn't just have a ton of bits
07:16 for every single Pixel the way they
07:18 would handle this is you would have a
07:19 limited palette of the colors you're
07:21 allowed to use and then instead of
07:23 referencing a specific color like red
07:25 and all of the bytes for that you would
07:27 reference color one out of your eight
07:30 color palette this is related to where 8 bit
07:32 color came from the idea that with an 8 bit index
07:34 you have up to 256 colors you have
07:36 predetermined and then each pixel
07:38 references one of those colors palette
07:40 based images also known as color mapped
07:41 or indexed color images use the PLTE
07:44 chunk and are supported in four pixel
07:46 depths again the point of this is
07:48 instead of each pixel having a much
07:50 larger amount of data that represents
07:52 the color of that pixel it starts with
07:54 all the potential colors and now all
07:56 future pixels can just reference an
07:58 index from that list instead of
08:00 having to determine the whole color so
08:02 if you have two red pixels instead of
08:04 both of those pixels describing red both
08:06 of those pixels can just list the index
08:07 for where red exists in that color
08:10 palette color palette based stuff is not
08:12 as common anymore in fact when I say
08:14 color palette most people are probably
08:15 thinking of a tool in their graphics
08:17 editor that has a set of predetermined
08:18 colors they can use but this was how
08:21 game consoles used to work this is how
08:23 encoding images efficiently used to work
08:25 these tools and Technologies existed to
08:27 make graphics in the first place and
08:29 it's a really important stepping stone
08:31 as you start to understand how all these
08:33 pieces come together we see palette based
08:35 with transparency transparency is a
08:38 really complex thing because it's not on
08:40 or off transparency is a spectrum is it
08:43 10% opaque is it 100% opaque is it 0%
08:46 opaque so you end up having to have a
08:48 whole separate Channel usually two
08:49 additional bits just to describe how
08:52 deep each pixel is as such a lot of
08:55 things just don't support transparency
08:56 and something like video there are no
08:58 major video standards that support
09:00 transparency that actually support it
09:02 properly in the browser believe me I
09:04 would know I have tried a lot you can
09:06 play back video with transparent stuff
09:08 in it sometimes but you cannot record
09:10 transparent video through the browser
09:12 and send it somewhere else it's
09:13 transparency is a huge huge rabbit hole
09:16 with all of this because we've added an
09:18 additional channel to every pixel and a
09:19 lot of things don't know what to do with
09:21 that channel this is also why PNGs can be
09:23 transparent though versus jpegs which
09:25 don't have any concept of that
09:26 additional Channel nothing about
09:27 transparency back to the smallest PNG
09:29 hopefully you guys are starting to see
09:30 how nerdy and obsessed I am with this
09:31 stuff it is really fun let me know in
09:33 the comments if you're enjoying this
09:34 because I I want to do more videos about
09:36 AV stuff and AV tech but people don't
09:38 normally seem to care so let me know if
09:40 I'm wrong and you really like this with
09:42 a bit depth of one and a color type of
09:43 zero we encode these values with 01 00
09:47 very simple the next byte is the
09:49 compression method all PNGs set this to 00
09:52 for now this is here just in case they
09:54 want to add another compression method
09:55 later as far as I know nobody has if I've
09:57 learned anything from my time with codecs
09:59 it's that there's a lot of these types of
10:01 things in it where they leave some
10:04 amount of space in their definition for
10:06 oh it'd be cool if somebody adds this in
10:07 the future we'll leave this gap
10:09 for it and then it just never gets
10:10 filled and as a result half or more of
10:12 the images on the internet have a bunch
10:14 of bytes in them that don't actually
10:15 communicate anything they're just
10:17 placeholders for theoretical Futures
10:19 which it's interesting that even in the
10:20 smallest possible PNG we have to set
10:23 this to 00 because everything always
10:25 has that same for the filter method
10:26 which is always 00 last part of the
10:28 chunk data is the interlace method PNGs
10:30 support progressive decoding which
10:31 allows images to be partially rendered
10:33 as they download we aren't going to use
10:34 this feature so we'll set it to 0 oh
10:36 this actually is what I was talking
10:37 about earlier I didn't know PNG just had
10:39 this as part of the standard like that
10:41 but you can interlace the way the chunks
10:43 are spread out so you can take a fourth
10:45 of the pixels from every fourth row to
10:48 give you a blurry version and then go
10:49 back and get the rest over time and that
10:51 is part of the standard and you can
10:52 describe that by putting an interlace
10:55 method and then finally every chunk ends
10:58 with a 4 byte checksum it uses a common
11:00 checksum function called CRC32 and
11:02 uses the rest of the chunk as an input
11:04 computing the checksum gives us the
11:05 following bytes this is the checksum for
11:08 everything that's come up to this point
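The header chunk walked through above can be sketched with Python's `struct` and `zlib` modules. This is a sketch under stated assumptions, not the article's code: `make_chunk` is a hypothetical helper, and it assumes the PNG rule that the CRC-32 is computed over the chunk type and data (the length field is not included).

```python
import struct
import zlib


def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Lay out one chunk: 4-byte length, 4-byte type, data, 4-byte CRC-32."""
    length = struct.pack(">I", len(data))  # big-endian length of the data field only
    crc = struct.pack(">I", zlib.crc32(chunk_type + data))  # CRC over type + data
    return length + chunk_type + data + crc


# 13 bytes of IHDR data: width, height (4 bytes each), then one byte each for
# bit depth, color type, compression method, filter method, interlace method.
ihdr_data = struct.pack(">IIBBBBB", 1, 1, 1, 0, 0, 0, 0)
ihdr_chunk = make_chunk(b"IHDR", ihdr_data)

print(len(ihdr_data), len(ihdr_chunk))
print(ihdr_chunk.hex(" "))
```

The printed hex starts with the length and the ASCII bytes for IHDR, followed by the thirteen data bytes and the four checksum bytes.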
11:10 altogether here is the whole image we
11:12 have these bytes which describe the data
11:14 length this which describes IHDR in ASCII the
11:17 width the height bit depth the color
11:19 type the compression method filter
11:22 method interlace method and then the
11:24 checksum and that's just the metadata
11:26 at the start this is all of the things
11:28 that we have done before we've even
11:30 started to render the
11:31 image so let's start the actual pixel
11:35 data our next chunk is IDAT which is
11:37 short for image data so since it's an
11:39 image data Chunk we need to encode that
11:41 it is that and this is the short code
11:43 for it so we have 10 bytes of data we'll
11:45 talk about what it is promises 10 bytes
11:47 now let's encode IDAT for the chunk type
11:49 here we are again just ASCII of the
11:51 values and now for the interesting part
11:53 the image data first step is uncompressed
11:56 pixels image data is encoded in a series
11:58 of scan lines and then compressed a scan
12:00 line represents a horizontal line of
12:02 pixels for example a 123x456 image
12:05 has 456 scan lines again if we look at a
12:08 baseline jpeg in the top to bottom way
12:10 the image loads in you'll see that this
12:12 is because of those scan lines every
12:14 time a scan line has loaded on your
12:15 device it can render it and those are
12:16 coming through horizontally this is also
12:19 how a lot of displays work where they
12:21 render top to bottom which introduces
12:23 things like vertical tearing which if
12:24 you've ever played a game on a device
12:26 before you've probably seen this where
12:28 tear point one happens with the trees
12:30 there I'll zoom in a bit so you can see
12:31 it better you can see these trees in
12:32 this ground are misaligned that's
12:34 because if this was a video by the time
12:36 you're a certain distance down the
12:38 monitor the image the monitor is
12:40 rendering has updated so you have the
12:41 next image and the rest of the monitor
12:43 will update with that and you get
12:45 another image and the rest of the
12:46 monitor updates with that because the
12:47 monitor updates top to bottom this is
12:50 why getting these things correct is so
12:52 important vsync helps guarantee that
12:54 your monitor will always run the same
12:55 speed as the frames it's getting to
12:57 prevent this so that's that's the
12:59 importance of scan lines but since this
13:00 image is just one pixel tall we only
13:02 have one here scan lines start with
13:04 something called a filter type which can
13:05 improve compression our image is so
13:07 small this is irrelevant so we'll use
13:08 filter type zero this is interesting
13:09 because the compression occurs per line
13:12 quick interjection turns out I wasn't
13:14 exactly right about this being line by
13:15 line it's not quite what I meant but to
13:17 be very clear the different filter
13:19 methods have entirely different
13:20 behaviors based on surrounding pixels or
13:22 not there's a long page in the W3C
13:25 standards about how all this works that
13:28 is almost as old as me I'm not going to
13:30 go any further into detail just know
13:32 that the filtering is a little more
13:33 complex than I initially said sorry
13:35 about that anyways back to much more
13:38 prepared Theo after the filter type each
13:40 pixel is encoded with one or more bits
13:41 depending on the bit depth in our case
13:43 we just need one bit per pixel recall
13:45 that we have a bit depth of one all
13:46 black or white it's also cool because you can
13:48 have way more bit depth if you want to
13:49 do an HDR image or images with deeper
13:51 more Rich colors bit depth is an
13:53 important thing that a lot of people
13:55 don't know about it's also why HDR is so
13:57 exciting because the bit depth available
13:58 to us if you've heard of 10 bit color in
14:00 monitors before this is what they're
14:02 talking about there are more bits per
14:04 pixel which allow you to represent a
14:06 larger set of colors if your pixel data
14:09 doesn't line up with a byte boundary in
14:10 other words if it's not a multiple of
14:12 eight bits you pad the end of your scan
14:14 line with zeros this is true in our case
14:16 so we add seven padding bits to fill out
14:18 the byte putting that together the
14:20 single zero bit and the seven zero
14:22 padding bits this is the scan line now
14:24 we need to compress it second step
14:26 compression next we compress the scan
14:28 line data well not quite more accurately
14:30 we run it through a compression
14:31 algorithm most of the time compression
14:33 algorithms produce smaller outputs
14:35 that's the whole point but sometimes
14:37 compressing tiny inputs actually
14:38 produces bigger outputs because of some
14:40 small overhead unfortunately for us
14:42 that's what happens here but the PNG
14:44 file format makes us do it again as we saw
14:48 here only PNG compression method
14:51 zero is defined by the international
14:52 standard you can't put any other value
14:54 here so you have to use this type of
14:56 compression if it is a valid PNG PNG
14:59 image data is encoded in the zlib format
15:00 using the deflate compression algorithm
15:02 deflate is also used in gzip and zip two very
15:04 popular compression formats I won't go
15:05 in depth on deflate here but here oh wow
15:08 look at that we both had the same Vibe
15:09 but here's what our chunk data contains
15:11 the zlib header which is two bytes one
15:13 compressed deflate block that encodes
15:15 two literal zeros four bytes and then
15:16 the zlib checksum this is separate from
15:18 the PNG chunk checksum which is four
15:20 bytes so this is to make sure once
15:21 you've deflated that that line still
15:24 passes your checksum for more on how
15:26 deflate Works check out an explanation
15:27 of the deflate algorithm interesting
15:29 it's kind of what I was looking for
15:30 before this looks like it would be very
15:32 fun this was posted on the 23rd of
15:34 August 97 I'm going to save this if
15:36 y'all want me to do a video all about
15:38 how image compression works let me know
15:41 in the comments and I might do just that
15:42 cuz this is a very interesting article
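The chunk data described above (a 2-byte zlib header, one deflate block holding two literal zeros, and a 4-byte zlib checksum) can be checked with Python's `zlib` module. The ten bytes below were assembled by hand for this sketch to match that layout; the bytes shown on screen are not captured in this transcript, so treat them as an assumption rather than a quote from the article.

```python
import zlib

# One scan line: filter type byte 0, then one byte holding the single
# pixel bit (0 = black) followed by seven zero padding bits.
scanline = bytes([0, 0b00000000])

# Hand-assembled zlib stream (assumed layout, see the note above):
stream = bytes.fromhex(
    "7801"      # zlib header
    "63600000"  # one final deflate block encoding two literal zero bytes
    "00020001"  # Adler-32 checksum of the uncompressed scan line
)

print(zlib.decompress(stream) == scanline)  # does the stream decode to the scan line
print(len(scanline), len(stream))           # uncompressed size vs. compressed size
print(hex(zlib.adler32(scanline)))          # the zlib checksum, recomputed
```

Running your own `zlib.compress(scanline)` should also round-trip, though the exact header byte can vary with the compression level your zlib build uses.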
15:44 altogether here are the 10 data bytes
15:47 interesting and again unfortunate we had
15:49 to run our 2 byte scan line through an
15:50 algorithm that made it five times bigger
15:52 but the PNG standard makes us do it and
15:55 with that we can compute the PNG's
15:56 checksum field and finish off the chunk so
15:59 00 00 00 0A for the data length 49 44 41 54 for IDAT
16:02 78 01 is the header for zlib the compressed
16:06 block the checksum for the block and
16:08 the checksum for the whole
16:11 chunk and just one more chunk to go
16:13 taking a final look at our checklist
16:15 before we have the end of image chunk
16:17 now let's look at the end and also the
16:19 interesting things at the end of the
16:20 article that he promised poetically PNGs
16:23 end like they began with a small number
16:24 of constant bytes IEND is the final
16:27 chunk short for image trailer I don't
16:29 see how IEND is short for image trailer
16:32 if someone wants to describe how the
16:34 word end the word trailer become the
16:37 same thing other than trailers being
16:40 behind the thing they're attached to let
16:41 me know but I don't think IEND and
16:45 trailer are the same anyways the zero
16:48 length is encoded with four zeros and
16:50 IEND is encoded with these four again we
16:52 know how they do that now because
16:54 there's no data in the Chunk we just
16:55 move on to the checksum because
16:57 everything else in the chunk is
16:58 constant the checksum is always the same
17:00 interesting and then we have the whole
17:01 trailer chunk the zero the IEND and the
17:06 checksum now our PNG is done beautiful it
17:09 starts with a classic PNG signature
17:11 follows up with a bit of metadata
17:12 compresses the pixel data and then signs
17:14 off with an empty chunk and that's how
17:15 we made the world's smallest
17:17 PNG or is it well here's the twist there's
17:21 a lot of champions technically this is
17:23 tied for first place because there's a
17:26 lot of other things that would compress
17:27 to the same size as long as you encode
17:29 all of the pixel data in a single byte
17:31 we can tie for the world's smallest
17:33 PNG so this 8x1 black image is also 67
17:37 bytes but it's eight times larger
17:38 because again you have to use the whole
17:41 chunk that you've now compressed this
17:43 actually is really interesting how
17:44 unintuitively the compression built into
17:46 PNG actually results in files being
17:49 larger and because that has set this
17:51 minimum we now have the ability to make
17:53 things bigger without the size getting
17:55 larger with our 1 by 1 image recall that
17:58 7 bits were effectively wasted on
18:00 padding if we put black pixels in the padding and
18:03 we just use all that padding nothing
18:04 really changes instead of adding more
18:06 pixels you can also add more color
18:08 resolution many gray colors can be
18:10 encoded in a single byte letting us tie
18:11 for first for example this 1x1 gray
18:13 pixel is also 67 bytes again this uses
18:16 up the whole byte we have available unlike
18:18 the 1x1 black image but if you're
18:19 interested in this topic my former
18:20 coworker published the biggest smallest
18:22 PNG this is actually the article I saw
18:24 that led me to seeing this article where
18:26 someone went really in-depth on the
18:31 compression side this was an interesting
18:33 article if you want to see somebody go
18:35 way deeper on this cool thing I'll have
18:38 this linked in the description but I
18:39 want to focus on the more pointed
18:41 smallest pixel article really cool stuff
18:44 that this is being followed up on and
18:45 that we can go back and forth on just
18:48 making a single pixel image but it's
18:50 really interesting and with standards
18:51 that have been around for as long as PNG
18:53 it's important that we understand them
18:54 and make the things necessary for people
18:56 to not just get how this works works but
18:58 theoretically improve it and make better
19:01 future so to summarize PNGs start with a
19:04 signature the rest of the file is made
19:06 up of chunks each chunk has all of this
19:08 stuff some chunks are always required
19:10 like the image header the smallest PNG
19:12 uses the minimum number of chunks and the
19:13 smallest possible data if you want to
19:15 learn more about pngs he actually made a
19:17 PNG chunk Explorer where you can upload
19:20 a PNG and it will break it down for you
19:22 let's find a random one here is the
19:25 going live PNG for my stream
19:28 and we can see the signature the header
19:32 the international text Data interesting
19:34 it has that the color profile physical
19:36 data and then all of these image data
19:38 chunks it's a pretty big image really
19:40 cool that this is a tool that just
19:42 exists so you can explore and see how
19:44 your images work interesting he also
19:46 made a single color image so this
19:48 programmatically generates a one color
19:50 PNG using the standard usually when I
19:53 think of rendering PNGs I'm thinking
19:54 of using a tool of some form to generate
19:57 a pixel map not programmatically create
20:00 chunks so it's really interesting
20:02 thinking of a PNG in this way rather
20:04 than as a Target actually as like code
20:06 itself because when I work with pngs I
20:08 am exporting to them I'm not writing
20:11 data into them if that makes sense so
20:13 this is a really interesting project
20:14 that you can only really do if you
20:17 understand the depth of how PNGs are
20:18 encoded and work and finally he also
20:21 wrote about the largest possible PNG
20:23 there's no theoretical file size limit
20:24 but there is a maximum number of pixels
20:26 and many decoders impose limits as well
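The pixel limit mentioned above follows from the header fields: width and height are each stored in four bytes, and the PNG specification caps them at 2^31 - 1. A small sketch of the arithmetic, with variable names chosen just for this example:

```python
# Width and height are 4-byte fields, limited by the PNG spec to 2**31 - 1.
max_dimension = 2**31 - 1

# The most pixels a single image could declare in its header.
max_pixels = max_dimension * max_dimension

print(max_dimension)
print(max_pixels)
```

Whether any decoder will actually open an image that large is a separate question, as the limits many decoders impose are much lower.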
20:29 fascinating this is a phenomenal article
20:31 I liked this a lot huge shout out to
20:33 Evan for writing this article I love the
20:34 opportunity to talk all about anything
20:37 to do with AV and codecs and this stuff
20:39 is where a lot of my time was spent back
20:40 when I worked at Twitch and it's been a bit
20:42 since I could talk about it in detail so
20:44 thank you for the opportunity hope
20:46 y'all liked this if you want to hear me
20:47 talking more about AV stuff I'll pin a
20:48 video in the corner see you guys in the
20:50 next one peace nerds
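For anyone who wants to try the whole walkthrough end to end, here is a minimal sketch that assembles the file from its four sections. It is not the article's code: `png_chunk` and `tiny_black_png` are hypothetical names, the ten IDAT bytes are hand-assembled to match the layout described in the video, and the `width` parameter is only there to illustrate the tie, since up to eight black 1-bit pixels fit in the same scan line byte.

```python
import struct
import zlib

SIGNATURE = bytes.fromhex("89504e470d0a1a0a")  # the 8-byte PNG signature


def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """4-byte length, 4-byte type, data, then CRC-32 over type + data."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def tiny_black_png(width: int = 1) -> bytes:
    """Build an all-black, 1-pixel-tall, 1-bit grayscale PNG (width 1 to 8)."""
    if not 1 <= width <= 8:
        raise ValueError("this sketch only covers widths that fit in one byte")
    # width, height, bit depth 1, color type 0, compression 0, filter 0, interlace 0
    ihdr = struct.pack(">IIBBBBB", width, 1, 1, 0, 0, 0, 0)
    # zlib stream for the scan line (filter byte 0 + one all-zero pixel byte);
    # hand-assembled here as an assumption, not copied from the article
    idat = bytes.fromhex("78016360000000020001")
    return (
        SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", idat)
        + png_chunk(b"IEND", b"")
    )


png = tiny_black_png()
print(len(png))
print(len(tiny_black_png(8)))
```

Writing `png` to a file with `open("tiny.png", "wb")` should give an image any PNG viewer can open; the 8-pixel-wide variant comes out the same size because only the width field in the header changes.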