sinedpick@awful.systems to TechTakes@awful.systems • Stubsack: weekly thread for sneers not worth an entire post, week ending Sunday 30 June 2024
5 months ago
I tried using Claude 3.5 sonnet and … it’s actually not bad. Can someone please come up with a simple logic puzzle that it abysmally fails on so I can feel better? It passed the “nonsense river challenge” and the “how many sisters does the brother have” tests, both of which fooled gpt4.
Thanks for the suggestions. The LLM is free to use (for now) so I thought I’d poke it and see how much I should actually be paying attention to these things this time around.
Here are its answers. I can’t figure out how to share chats from this god-awful garbage UI so you’ll just have to trust me or try it yourself.
edit: I didn’t do any prompt engineering, just straight copy paste.