Part three: why was FLAGPT making errors and should we blame the underlying model?

News | Posted on Monday 13 May 2024

In the final part of his three-part blog series, Research and Innovation Fellow Dr Kester Clegg discovers the problem is essentially that our system instructions are getting ignored by GPT-4.


We can correct this, but only if we know what FLAGPT’s system instructions contain (the user doesn’t). Telling users to type in ‘Follow your system instructions’ is opaque and not a viable solution. FLAGPT’s failure logic is better than GPT-4’s, but it still makes mistakes and there is no way of knowing the source of those errors. Similarly, there is no way to predict a ‘drift’ from your desired alignment: sometimes it works nearly perfectly, other times you have to correct almost every output.

Should we blame GPT-4?

When the visualisation code defaults to LaTeX instead of TikZ node declarations, then yes, this is GPT-4’s fault.
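For readers unfamiliar with TikZ, the short Python sketch below shows the kind of node declarations we want the visualisation step to emit for a single gate. It is purely illustrative: the helper function, node names and layout are our own placeholders, not FLAGPT’s actual output format.

# Minimal sketch: emit TikZ node declarations for a two-input fault tree gate.
# The node names, labels and coordinates are illustrative only; they are not
# FLAGPT's actual output format.

def tikz_fault_tree(top_event, gate, basic_events):
    """Return TikZ node/draw declarations for one gate and its inputs."""
    lines = [r"\begin{tikzpicture}[every node/.style={draw, align=center}]"]
    lines.append(rf"\node (top) at (0, 2) {{{top_event}}};")
    lines.append(rf"\node (gate) at (0, 1) {{{gate}}};")
    for i, event in enumerate(basic_events):
        x = 2 * i - (len(basic_events) - 1)  # spread the inputs symmetrically
        lines.append(rf"\node (e{i}) at ({x}, 0) {{{event}}};")
        lines.append(rf"\draw (e{i}) -- (gate);")
    lines.append(r"\draw (gate) -- (top);")
    lines.append(r"\end{tikzpicture}")
    return "\n".join(lines)

print(tikz_fault_tree("Loss of cabin conditioning", "AND",
                      ["Pack 1 fails", "Pack 2 fails"]))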

When the system instructions on how to parse redundancies for failure logic get ignored, well, this is harder to know. Is GPT-4 overriding our GPT? 

Competing objectives? Difficult. Our system instructions say ‘be brief’ because we want to reduce the probability of errors due to output token length, but we also want a detailed analysis. Can we have both?

Are all errors created equal?

Failure events combined by logic gates vary in significance, which makes a qualitative assessment of failure logic analysis difficult.

For example, for the air conditioning packs: assuming AND failure logic could give you a false belief in a redundancy that doesn’t exist, while assuming OR failure logic can give you the impression the system is much less robust than designed.
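To make the difference concrete, here is a minimal Python sketch with a purely hypothetical per-pack failure probability (not a figure from any real aircraft), comparing the top-event probability under the two gate assumptions:

# Illustrative only: compare the probability of losing cabin conditioning when
# the two air conditioning packs are modelled with an AND gate (true
# redundancy) versus an OR gate (no redundancy).
# The per-pack failure probability below is hypothetical.

p_pack_fails = 1e-3  # hypothetical probability that one pack fails

# AND gate: the top event occurs only if BOTH packs fail (independent events).
p_top_and = p_pack_fails * p_pack_fails

# OR gate: the top event occurs if EITHER pack fails.
p_top_or = 1 - (1 - p_pack_fails) ** 2

print(f"AND logic (redundant packs): {p_top_and:.2e}")  # ~1e-06
print(f"OR logic (no redundancy):    {p_top_or:.2e}")   # ~2e-03

In this toy example, getting the gate wrong shifts the estimated likelihood of the top event by roughly three orders of magnitude, which is why neither kind of error is harmless.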

N.B. there’s often more than one way to model failure logic, especially when ‘explanatory placeholders’ are used in the fault tree to make it easier for people to understand.

Conclusions

FLAGPT’s failure logic is better than GPT-4’s and it can visualise fault trees. It’s very far from perfect, but it does work (eventually), provided you (the safety professional) correct it. FLAGPT’s processing pipeline is very fast: yes, you could do the analysis better yourself, but it would take you three to seven times as long! Sometimes GPT-4 overrules your GPT and you’ll get thrown a generic response, and you won’t know why or when this will happen. FLAGPT has many of the flaws of its big brother GPT-4: it over-confidently gives incorrect logic and explanations. But what will GPT-x look like in five years' time?

All our code for this work is available for cloning or inspection on GitHub.

Contact Research and Innovation Fellow Dr Kester Clegg (kester.clegg@york.ac.uk) if you'd like more information about our work with LLMs and to talk through opportunities and projects around creating safer generative AI.