Post AXcdJkCvQNdeUrKezg by medecau@hachyderm.io
(DIR) Post #AXcdJkCvQNdeUrKezg by medecau@hachyderm.io
2023-07-12T14:01:56Z
0 likes, 0 repeats
@simon just got GPT-4 with Code Interpreter to solve the sql-murder-mystery challenge. It needed a little bit of guidance but performed a lot better than the LangChain plan-and-execute agent with GPT-4. (https://chat.openai.com/share/52b550ce-90ae-4ded-9364-2e1bdbdfa302) Would you be able to articulate what we're missing in order to get the same behaviour locally? There should be even fewer problems reaching the right answers if we are willing to let the agent operate without boundaries (network or others).
(DIR) Post #AXcdJksOwAocZUjmds by simon@fedi.simonwillison.net
2023-07-12T15:19:11Z
0 likes, 0 repeats
@medecau my hunch is that Code Interpreter uses a fine-tuned model to get such great results - I'd love it if we could get equivalent quality from other models, but I expect it would be pretty difficult
(DIR) Post #AXcknBhD09VZpNU3pw by medecau@hachyderm.io
2023-07-12T16:42:27Z
0 likes, 0 repeats
@simon I had in mind running similar environments, minus the limitations you mentioned on this recent podcast about it, but still using ‘gpt-4’
(DIR) Post #AXcmGYVjM5N2wGgNwe by simon@fedi.simonwillison.net
2023-07-12T17:00:05Z
0 likes, 0 repeats
@medecau I want to try that too, but I'm expecting it to be tricky because I don't think the model they use for Code Interpreter is the exact same model as the GPT-4 that's available via their API. I might be wrong about that though! Plus with the right prompt engineering on top of OpenAI functions it may be possible to get great results anyway.