2.3.1 Sufficiency of training data

Assurance objective: The learned algorithm is trained to satisfy the safety requirements using data that is sufficiently representative of the RAS operating environment and operating scenarios.

Contextual description: The training data must be sufficient to ensure that the trained algorithm will satisfy the defined safety requirements. This must include assurance that the training data provides sufficient coverage of all operating scenarios in the defined operating environment. At the same time, it must be ensured that the learned algorithm does not become over-fitted to the training data, which would result in a lack of generalisation. This means that the algorithm should be shown to be robust, in that its performance with test data does not significantly deteriorate from the performance achieved with the training data. Issues relating to the processing and classification of the training data are also important considerations.
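The two checks described above (scenario coverage and the deterioration between training and test performance) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from this guidance: the scenario labels, the `min_samples` and `max_gap` thresholds, and the function names are hypothetical, and a score is assumed to be a single number where higher is better. Passing such checks is supporting evidence only; it does not by itself demonstrate that the safety requirements are met.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def uncovered_scenarios(
    required: Iterable[Hashable],
    sample_labels: Iterable[Hashable],
    min_samples: int = 1,
) -> List[Hashable]:
    """Return the required scenarios that have fewer than min_samples training samples."""
    counts = Counter(sample_labels)  # missing labels count as zero
    return [scenario for scenario in required if counts[scenario] < min_samples]


def generalisation_gap(train_score: float, test_score: float) -> float:
    """Training score minus test score (assumes a higher score means better performance)."""
    return train_score - test_score


def is_robust(train_score: float, test_score: float, max_gap: float) -> bool:
    """True if test performance deteriorates by no more than max_gap (hypothetical threshold)."""
    return generalisation_gap(train_score, test_score) <= max_gap


if __name__ == "__main__":
    # Hypothetical operating scenarios and per-sample scenario labels.
    required = ["day/dry", "day/rain", "night/dry", "night/rain"]
    labels = ["day/dry"] * 40 + ["day/rain"] * 12 + ["night/dry"] * 3

    print(uncovered_scenarios(required, labels, min_samples=5))
    # -> ['night/dry', 'night/rain']
    print(is_robust(train_score=0.97, test_score=0.95, max_gap=0.05))
    # -> True
```

In practice the acceptable gap and the minimum number of samples per scenario would need to be justified from the safety requirements rather than chosen arbitrarily, and the scenario list would need to reflect the defined operating environment.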

Practical guidance: The ML algorithm may be trained through operation of the system itself, or may be trained on a simulator before integration into the target RAS. The guidance will discuss the assurance considerations associated with real-world and simulation-based training (or the use of a combination).

There are common challenges that may be encountered during training. As well as ensuring robustness, these include avoiding negative side effects, and avoiding ‘reward hacking’ and other potential problems* when using a reinforcement learning approach. It is important that the training undertaken can be demonstrated to mitigate such problems.

Explainability can be important during training: by identifying what has been learned, it can indicate ways to make the training data more effective.

 

* Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J. and Mané, D., 2016. Concrete problems in AI safety. arXiv preprint arXiv:1606.06565.