Hi, this is Lance from LangChain. I want to talk about using LangGraph for code generation. Code generation is one of the really interesting applications of LLMs; we've seen projects like GitHub Copilot become extremely popular. A few weeks ago, a paper came out from the folks at Codium AI called AlphaCodium. It was a really cool paper, in particular because it introduced the idea of doing code generation using what you can think of as flow engineering. Instead of just an LLM, a coding prompt like "solve this problem," and a solution, it generates a set of solutions and ranks them. That part is fairly standard prompt-response flow, but here's what I want to draw your attention to: it actually tests the code in a few different ways, on public tests and on AI-generated tests, and, crucially, it iterates and tries to improve the solution based on those test results. That was really interesting. A tweet from Karpathy on this theme made the point that flow engineering is a really nice paradigm, moving away from naive prompting toward building up an answer iteratively over time using testing. So it's a really nice idea.
What's kind of cool is that a few weeks ago we introduced LangGraph as a way to build arbitrary graphs that can represent different kinds of flows. I've done some videos on this previously, talking about LangGraph for things like RAG, where you do retrieval, then a retrieval-quality check (grade the documents), and if they're not good you retrieve again or fall back to a web search. It's a way to represent arbitrary logical flows with LLMs, much like we do with agents, but the benefit of graphs is that the flow you outline is more constrained. It's like an agent with guardrails: you define the steps in a very particular order, and every time you run the graph it executes in that order. So what I want to do is implement some of the ideas from AlphaCodium using LangGraph, and we're going to do that right now.
In particular, let's say we want to answer coding questions about some part of the LangChain documentation. For this I'm going to choose the LangChain Expression Language (LCEL) docs. It's a subset of our docs, around 60,000 tokens, and it focuses only on LCEL, which is basically a way to represent chains in LangChain; we'll talk about that a bit. I want to do a few simple things. First, one node in our graph takes a question and outputs an answer, using the LCEL docs as a reference. Then, given that answer, I want to parse out its components: the preamble (what is this answering?), the imports specifically, and then the code. To do this I'll use a Pydantic object, so the output is formatted. With that in place, I can really easily implement tests: check that the imports work, and check that the code executes. If either of those fails, I loop back to my generation node and say, "Hey, try again; here's the error trace." Again, what they're doing in AlphaCodium is far more sophisticated, and I don't mean to suggest we're implementing it as-is; it actually works on a bunch of public coding challenges, with tests for each question that are both AI-generated and publicly available. We're doing something much simpler, but I want to show how you can implement these kinds of ideas, and you can make it arbitrarily complex if you want.
So I'm going to copy some code into a notebook I have running. All I've done is a few pip installs, and I've defined a few environment variables for LangSmith, which we'll see later is pretty useful. I'm going to call this cell "docs": this is where I ingest the docs related to LangChain Expression Language, and I'm kicking it off right now. It uses a URL loader to grab all the docs, sort them, and clean them up a little. And here we go: these are all the docs related to LCEL, around 60,000 tokens (I've measured it in the past). So there are our docs.
Now I want to show you something that's very useful; I'll call it tool use. This is with OpenAI models, though other LLMs have similar functionality. What I'm going to do here is show how to build a chain with structured output. Remember, in our diagram we want three things for every solution: a preamble, imports, and code, as a structured object we can work with individually. We import BaseModel and Field from Pydantic and define a data model for our output: a prefix, which is the plain-language setup to the problem; the import statements; and the code. I want those as three distinct things I can work with later.
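As a rough sketch of that data model (using a stdlib dataclass as a stand-in; the actual code uses Pydantic's BaseModel with Field descriptions, and the field names here follow the walkthrough):

```python
from dataclasses import dataclass

@dataclass
class CodeSolution:
    """Structured output for one solution attempt."""
    prefix: str   # plain-language setup / description of the approach
    imports: str  # import statements only
    code: str     # code body, excluding imports

# The three distinct pieces we want to work with later.
solution = CodeSolution(
    prefix="Build a simple chain that pulls a key out of the input.",
    imports="from operator import itemgetter",
    code="get_text = itemgetter('text')",
)
print(solution.prefix)
print(solution.imports)
print(solution.code)
```

With Pydantic, each field would also carry a `description` so the model knows what belongs in each slot when it fills the function call.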
I'll use GPT-4 Turbo (the 0125 preview, a 128k-context-window model). Then I take this data model, turn it into a tool, and bind it to my model. Basically, what's happening is that the model will always perform a function call and attempt to output in the format I specify. That's all that's happening. I define a prompt that says: here are all the LCEL docs (LangChain Expression Language, abbreviated LCEL); answer the question and structure your output. What's cool is that we're always forcing that function call, so the model tries to output a Pydantic object.
Now what's nice is that I can just invoke this with a question. Let's try it: I'll set the question to "How do I create a RAG chain in LCEL?"... okay, the input needs to be a dict; there we go, it's running. You can see we passed in all those docs we previously loaded, so it's about 60,000 tokens of context. When you think about newer long-context LLMs like Gemini, it becomes more and more feasible to do things like this: take a whole code base or a whole set of documentation, load it, stuff it into a model, and say, "Answer questions about this." It's still running; the latency is definitely higher because the context is very large, but that's fine, we have a little bit of time, and we can go over to LangSmith while it runs and have a look. We can see here was our prompt.
There you go: 63,000 tokens. You can see it's a lot of context, and we can actually see it all in LangSmith. We don't want to scroll through all that, but you can see we've asked a question, we're grounding the response in all of the LCEL docs, and we're hopefully going to get the response back as a Pydantic object we can play with. Let's see... okay, nice, it's done. You can see our object has a prefix, and it also has our imports; we can see that in LangSmith too. The answer is here: there are your imports, your code, and your prefix, and these can all be extracted from that object really easily. The result is basically a list containing a Pydantic object, and you can extract each field: answer.prefix, answer.imports, answer.code, or whatever your keys are. So that's great: it shows how tool use works and how we can get structured output out of our generation.
Now that we've established we can do that, I'm going to start setting up our graph. First, I define our graph state. This is just a dictionary containing things relevant to our problem: it'll contain our code solution, it'll contain any errors, and that's all we need. Next comes all the code related to my graph; we're going to walk through it, so don't worry too much just yet.
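A minimal sketch of that state (key names follow the walkthrough; the real code defines a state schema that LangGraph passes between nodes):

```python
from typing import Optional, TypedDict

class GraphState(TypedDict):
    """Dictionary of everything the nodes read and write."""
    question: str                 # the user's coding question
    generation: Optional[object]  # the structured code solution, once produced
    error: Optional[str]          # error trace from a failed import/execution check
    iterations: int               # number of attempts so far

state: GraphState = {
    "question": "How do I build a RAG chain in LCEL?",
    "generation": None,
    "error": None,
    "iterations": 0,
}
print(sorted(state))
```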
So here's our code. The way to think about this is simple; let me go back to the diagram. Every node in our graph has a corresponding function, and that function modifies the state in some way. Our generation node works with the question and the iteration count; those are the parts of state we want as inputs. You can see it maps to the diagram: you have the question, and the iteration count just tracks how many times we've tried; we'll see why that's interesting later. The rest is exactly what we saw before: data model, LLM, tool use, the same prompt template.
Now here's where it gets interesting. If our state contains an error under the error key, it means we've fed back from one of our tests: an error has already been generated and we're retrying. And here's why that's interesting: if we're retrying, we append to our prompt, just like we saw above, something that says, "You tried this before; here was your solution (we saved that under the generation key in our state, the code solution); here is your error; please retry and answer this." So it's inducing a reflection based on the error before the retry. That's a very important point, because it gives us feedback: if there's a mistake in either the imports or the execution, we feed it back to generation, and generation retries with that information present. That's all that's happening: we add the error to the prompt, invoke the chain with it, and get a new code solution. Again, that's if "error" is in our state dict; if it isn't, we just generate our solution like we did before.
Easy. One little thing: every time we return, we write the output back to the state and increment our iteration count, recording how many times we've tried to answer this question. That's really it; you can see that's all we do: return the generation, the question, and the number of iterations.
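The generation node's control flow can be sketched like this (pure Python; `call_chain` is a hypothetical stand-in for the structured-output chain from earlier):

```python
def call_chain(prompt: str) -> str:
    # Hypothetical stand-in for invoking the LLM chain; returns a fake solution.
    return f"solution for: {prompt[:40]}"

def generate(state: dict) -> dict:
    """If an error is in state, append it to the prompt so the model can
    reflect on its prior attempt; always increment the iteration count."""
    prompt = state["question"]
    if state.get("error"):
        prompt += (
            f"\nYou tried this before. Prior solution: {state['generation']}."
            f"\nIt produced this error: {state['error']}. Please retry."
        )
    return {
        "question": state["question"],
        "generation": call_chain(prompt),
        "error": state.get("error"),
        "iterations": state["iterations"] + 1,
    }

first = generate({"question": "How do I map over inputs?", "iterations": 0})
retry = generate({**first, "error": "TypeError: unsupported operand"})
print(retry["iterations"])  # 2
```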
Now here's what's nice. We talked about having two checks: one for imports and one for execution. Our import-check node is really simple. From the solution we can get the imports out, just like we showed above; code_solution.imports comes from our Pydantic object (I'll move the diagram over so you can see: the Pydantic object has imports). All we do is attempt to execute the imports. If that fails, we alert "import check failed," and here's the key point: we create a new key, "error", in our dict, identifying that an error is present, that something failed here. You'll see we use that later. One other little trick: if there was a prior error in our state, we append to it. We want to maintain the accumulation of errors as we run multiple iterations, so we don't revert and repeat a mistake we already made on a future iteration. So we maintain our set of errors. If there's no error, we write None: we're good, keep going. Code execution is basically the same thing: we extract our code and our imports, create a code block of imports plus code, and try to execute it. If it fails, we write our error and append all prior errors; if it doesn't, we return None. That's all you really need to know.
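Both checks boil down to exec-ing a string and recording any failure under the error key, roughly like this (a runnable sketch; `SimpleNamespace` stands in for the Pydantic solution object):

```python
from types import SimpleNamespace

def check_code_execution(state: dict) -> dict:
    """Exec imports + code; on failure, write the error to state,
    appending to any prior error so mistakes accumulate across retries."""
    solution = state["generation"]
    code_block = solution.imports + "\n" + solution.code
    try:
        exec(code_block)
        error = None
        print("---CODE EXECUTION: PASSED---")
    except Exception as e:
        print("---CODE EXECUTION: FAILED---")
        error = f"Execution error: {e}"
        if state.get("error"):       # keep prior errors too
            error = state["error"] + "\n" + error
    return {**state, "error": error}

# A deliberately broken solution: adding a dict and a string raises TypeError.
bad = SimpleNamespace(imports="import math", code="result = {} + 'x'")
out = check_code_execution({"generation": bad, "error": None})
print(out["error"])
```

The import-check node is the same pattern, exec-ing only `solution.imports`.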
Now, note that we have two gates: we want to know whether either of those tests failed, and all we need to do is grab the error from state; remember, if the error is None, keep going. The first gate is the decision point before code execution: do we go on to code execution, or do we revert and retry? If there's no error when we get to this point, meaning the import check passed, we keep going to code execution; you can see we return the name of the node we want to go to. If there is an error, we return to the generate node. So really, these functions are conditional edges: they do a conditional check based on the output state. If there's no error, the function says go to this node; if there is an error, it says go back to the generate node. That's it. Same deal with deciding to finish: if there's no error, end. And here's where the iteration count comes in: for the sake of simplicity, I give it three tries, because I don't want it to run arbitrarily long. If there's no error, or if we've tried three times, just end; otherwise, go back to generate. So again, it's the same kind of thing: decide to finish based on whether or not there's an error from code execution. That's really it; that's all we're doing.
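The two conditional edges reduce to small decision functions that return the name of the next node (a sketch; node names follow the walkthrough):

```python
def decide_to_check_code_execution(state: dict) -> str:
    """After the import check: proceed if clean, otherwise regenerate."""
    return "check_code_execution" if state["error"] is None else "generate"

def decide_to_finish(state: dict) -> str:
    """After the execution check: end on success, or after three tries,
    so the graph never runs arbitrarily long; otherwise retry."""
    if state["error"] is None or state["iterations"] >= 3:
        return "end"
    return "generate"

print(decide_to_check_code_execution({"error": None}))       # check_code_execution
print(decide_to_finish({"error": "boom", "iterations": 3}))  # end
print(decide_to_finish({"error": "boom", "iterations": 1}))  # generate
```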
We can go down; we've already covered all of this. Here is where we actually define what we call our workflow. This is where we register all our nodes and edges as those functions and stitch them together. It's actually pretty straightforward; it follows exactly the diagram we showed above. We add all of our nodes and build the graph following the diagram, so you can follow along: set the entry point to generate; add an edge from generate to the import check; then our conditional edge, the decide-to-check-code-execution function. Depending on that function's output, we decide the next node to go to: if it says check code execution, we go to that node; if it says generate, we go back to generate. These mappings are where you specify the logic of the next node to visit, and the same applies for the finish decision. That's all we do; compile it, done. It maps to the diagram roughly one to one, so it's actually pretty straightforward.
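To make the wiring concrete without the library, here's a pure-Python stand-in for the compiled graph. In the real code this flow is declared with LangGraph's StateGraph (add_node, add_edge, add_conditional_edges, set_entry_point, compile), but the control flow it produces is essentially this loop:

```python
def generate(state: dict) -> dict:
    # Toy model: the first attempt has a syntax error; once an error trace
    # is in state, the "model" produces a fixed version.
    code = "x = 1" if state["error"] else "x = 1 +"
    return {**state, "generation": code, "iterations": state["iterations"] + 1}

def check_code(state: dict) -> dict:
    try:
        exec(state["generation"])
        return {**state, "error": None}
    except Exception as e:
        return {**state, "error": str(e)}

def decide_to_finish(state: dict) -> str:
    if state["error"] is None or state["iterations"] >= 3:
        return "end"
    return "generate"

state = {"generation": None, "error": None, "iterations": 0}
node = "generate"                       # entry point
while node != "end":
    if node == "generate":
        state = generate(state)
        node = "check_code"             # fixed edge: generate -> check
    else:
        state = check_code(state)
        node = decide_to_finish(state)  # conditional edge: end or retry

print(state["iterations"], state["error"])  # 2 None
```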
There's just one little thing we now need to do: pass in a question. Here's one; I've run a bunch of these tests already. It seems kind of random, but we actually built an eval set, and this is a question we found some problems with, so I want to show you why this is pretty cool. The question: I'm passing a text key into my prompt and I want to process it with some function, process_text; how do I do this using LangChain Expression Language? It's a weird question, but you'll see why it's fun in a little bit. Now I'm just going to run my graph. Because we print out what happens at every step, we can follow along and see what's happening.
So it's generating a solution. This may take a little while because it's the same kind of long-context generation we saw previously. While it's running, we can go to LangSmith and check this LangGraph run: it's loading up, and we're at generate, so it's doing the generation; this is still pending. Here are all our input docs, so you can see we passed this very large context to our LLM. Okay, this is interesting: it's going through the checks. The code import check worked; decide-to-check-code-execution made its decision; it's testing the code, and here's an interesting one: the code block failed, and the decision is to retry. So it's retrying.
Okay, it looks like it came to an answer. Let's go look at what happened in our LangGraph run to understand it. Let me pull up the error response; what I want to show you is the error that we appended to our prompt. Let me make this a bit bigger and scroll, because this is the crux of what I want to show you. Here it is. What's cool is that our initial attempt to solve this problem introduced an error: an execution error, "unsupported operand types for dict and str." Basically, it did something wrong, and we passed that into the prompt to the LLM when it performed the retry. Our initial solution was here, and it had a coding error as noted, but here you can see we provide that error and say, "Please try to re-answer this; structure your output the same way," with the same instructions and the question. And we can see that the code execution test now works: previously, when we tried this, it failed, and that error was passed along in the prompt, like we just saw; the new test indeed passes, and our final solution comes out as code. That's it. You can get some intuition for the fact that with this retry loop, you can recover from errors using a little bit of reflection. That's really the big idea.
And again, you get your answer out here. There are a bunch of keys in the state; I'll quickly show the keys, and then we can just look at the generation. It's a list, so let's break it out: there's our code object. We can see the prefix, the imports, and let's try the code; let's convince ourselves it actually works by just exec-ing it. It's doing something... there, it tells a joke. Great. So this is pretty cool: the first attempt to answer this question produced an error, and it then retried by passing that error back into the context, just like we outlined in our graph, and on the second try it got it correct. That's a nice example of how you can do this feedback and reflection. Now, we've actually done quite a bit more work on this.
I built an eval set of 20 questions related to LangChain Expression Language and evaluated them all using this approach, relative to not using LangGraph, and here are the results; I want to draw your attention to them because they're pretty interesting. For the import check, without LangGraph versus with LangGraph, it's about the same: imports weren't really a problem even before this retry-and-reflection machinery; they were okay on our eval set of 20 questions. I should note that we actually ran this four times; the chart shows standard errors. I accumulated the results and computed standard errors, so there's some degree of statistical soundness to these results. In any case, import checks were fine without it, but here's the big difference: code execution performance with and without LangGraph. Without LangGraph, with just single-shot answer generation, the success rate was around 55%; in many cases we saw code execution fail. With LangGraph and this retry-and-reflection machinery, the success rate goes up to around, I believe, 80%: almost a 50% relative improvement in performance. That was really impressive, and it shows the power of a very simple idea: attempting code generation with these very simple checks and reflection can significantly bump up your performance. The AlphaCodium paper shows this in a much more sophisticated context, but what's cool is that this is a very simple idea you can implement yourself in not much time.
We have all of this available as a notebook, and you can run it on any piece of code you want: just take whatever documents you like, plumb them in, and test this out for yourself. I've been really impressed; I think it's pretty cool. In general, I think LangGraph is a really nice way to build these kinds of reflective or self-reflective applications, where you build feedback loops: you do a check, and if the check fails, you try again with that feedback present in the retry. I'll also mention we have a blog coming out; I'm not sure there's anything in it I haven't already shown you. Nothing else to highlight: these were our results again, maybe a little clearer to see here, but again, a pretty significant improvement in performance from a simple idea. I definitely encourage you to experiment with this, and of course all this code will be available for you, so feel free to experiment and let us know how it goes.