
MEASURE-2.1: Test sets, metrics, and details about the tools used during test, evaluation, validation, and verification (TEVV) are documented.

>Control Description

Test sets, metrics, and details about the tools used during test, evaluation, validation, and verification (TEVV) are documented.

>About

Documenting measurement approaches, test sets, metrics, the processes and materials used, and associated details builds a foundation for a valid, reliable measurement process. Documentation enables repeatability and consistency, and can enhance AI risk management decisions.
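
As an illustration only (not part of the control text), a TEVV run could be captured as a structured record so that test sets, metrics, and tool versions are documented consistently and repeatably. The field names and example values below are assumptions for the sketch, not a prescribed schema.

```python
from dataclasses import dataclass, asdict
from datetime import date
import json

@dataclass
class TEVVRecord:
    """Hypothetical record of one TEVV run; field names are illustrative."""
    system_name: str
    test_set: str           # identifier of the evaluation data set
    test_set_version: str   # pin the exact version used
    metrics: dict           # metric name -> observed value
    tools: dict             # tool name -> version used during evaluation
    evaluated_on: str       # ISO date of the evaluation
    notes: str = ""

record = TEVVRecord(
    system_name="loan-approval-classifier",
    test_set="holdout-2024Q1",
    test_set_version="v3",
    metrics={"accuracy": 0.91, "false_positive_rate": 0.04},
    tools={"scikit-learn": "1.4.2"},
    evaluated_on=date(2024, 5, 1).isoformat(),
    notes="Thresholds reviewed with the risk team.",
)

# Persisting the record makes the evaluation repeatable and auditable.
print(json.dumps(asdict(record), indent=2))
```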

>Suggested Actions

  • Leverage existing industry best practices for transparency and documentation of all possible aspects of measurements. Examples include datasheets for datasets and model cards.
  • Regularly assess the effectiveness of the tools used to document measurement approaches, test sets, metrics, and the processes and materials used (a minimal completeness-check sketch follows this list).
  • Update the tools as needed.
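
As a hedged illustration of the assessment action above, documentation artifacts can be spot-checked for completeness against an agreed field list. The required fields and helper below are hypothetical, chosen only for the sketch.

```python
# Illustrative completeness check for a TEVV documentation artifact;
# the required fields are assumptions, not a mandated schema.
REQUIRED_FIELDS = {"test_set", "test_set_version", "metrics", "tools", "evaluated_on"}

def missing_fields(doc: dict) -> set:
    """Return required documentation fields that are absent or empty."""
    return {f for f in REQUIRED_FIELDS if not doc.get(f)}

doc = {"test_set": "holdout-2024Q1", "metrics": {"accuracy": 0.91}}
gaps = missing_fields(doc)
if gaps:
    print("Documentation incomplete; missing:", sorted(gaps))
else:
    print("Documentation complete")
```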

>Documentation Guidance

Organizations can document the following:

  • Given the purpose of this AI, what is an appropriate interval for checking whether it is still accurate, unbiased, explainable, etc.? What are the checks for this model? (A re-evaluation sketch follows this list.)
  • To what extent has the entity documented the AI system’s development, testing methodology, metrics, and performance outcomes?
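
One way to operationalize the first question is to document a re-evaluation interval and acceptance thresholds, then compare freshly observed metrics against them. The interval, thresholds, and metric names below are assumptions that an organization would set based on the system's purpose and risk.

```python
from datetime import date, timedelta

# Illustrative re-check schedule and thresholds (values are assumptions).
RECHECK_INTERVAL = timedelta(days=90)
METRIC_THRESHOLDS = {"accuracy": 0.88, "false_positive_rate": 0.06}

def recheck_due(last_checked: date, today: date) -> bool:
    """True if the documented re-evaluation interval has elapsed."""
    return today - last_checked >= RECHECK_INTERVAL

def checks_passed(observed: dict) -> bool:
    """Compare freshly observed metrics against documented acceptance thresholds."""
    return (observed["accuracy"] >= METRIC_THRESHOLDS["accuracy"]
            and observed["false_positive_rate"] <= METRIC_THRESHOLDS["false_positive_rate"])

if recheck_due(date(2024, 5, 1), date.today()):
    observed = {"accuracy": 0.90, "false_positive_rate": 0.05}  # from a fresh TEVV run
    print("Re-check passed" if checks_passed(observed) else "Escalate: thresholds breached")
```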

AI Transparency Resources

  • GAO-21-519SP - Artificial Intelligence: An Accountability Framework for Federal Agencies & Other Entities.
  • Artificial Intelligence Ethics Framework For The Intelligence Community.
  • WEF, Companion to the Model AI Governance Framework, 2020.

>References

Emily M. Bender and Batya Friedman. “Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science.” Transactions of the Association for Computational Linguistics 6 (2018): 587–604.

Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. “Model Cards for Model Reporting.” FAT* '19: Proceedings of the Conference on Fairness, Accountability, and Transparency, January 2019, 220–29.

IEEE Computer Society. “Software Engineering Body of Knowledge Version 3: IEEE Computer Society.” IEEE Computer Society.

IEEE. “IEEE-1012-2016: IEEE Standard for System, Software, and Hardware Verification and Validation.” IEEE Standards Association.

Board of Governors of the Federal Reserve System. “SR 11-7: Guidance on Model Risk Management.” April 4, 2011.

Abigail Z. Jacobs and Hanna Wallach. “Measurement and Fairness.” FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, March 2021, 375–85.

Jeanna Matthews, Bruce Hedin, and Marc Canellas. “Trustworthy Evidence for Trustworthy Technology: An Overview of Evidence for Assessing the Trustworthiness of Autonomous and Intelligent Systems.” IEEE-USA, September 29, 2022.

Roel Dobbe, Thomas Krendl Gilbert, and Yonatan Mintz. “Hard Choices in Artificial Intelligence.” Artificial Intelligence 300 (November 2021).

>AI Actors

TEVV

>Topics

TEVV
Documentation
Validity and Reliability

>Cross-Framework Mappings

>Relevant Technologies

Technology-specific guidance with authoritative sources and verification commands.
