Changelog

⏰ TODO in Coming Versions

  • [x] Faster and simpler evaluation pipeline
  • [ ] Dynamic dataset
  • [ ] More fine-grained datasets
  • [ ] Chinese output evaluation
  • [ ] Downstream application evaluation

Version 0.3.0

Release Date: 23rd Apr, 2024

  • Support parallel retrieval of embeddings when evaluating AdvInstruction
  • Add exception handling for partial evaluations
  • Fix some bugs
  • Add evaluation results for ChatGLM3, GLM-4, Mixtral, Llama3-8b, and Llama3-70b (check out)

Version 0.2.3 & 0.2.4

Release Date: March 2024

  • Fix some bugs
  • Support Gemini API

Version 0.2.2

Release Date: 1st Feb, 2024

  • Support awareness evaluation in our new work
  • Support Zhipu API evaluation (GLM-4 & GLM-3-turbo)

Version 0.2.1

Release Date: 26th Jan, 2024

Version 0.2.0

Release Date: 20th Jan, 2024

  • Add generation section (details)
  • Support concurrency when using auto-evaluation

Version 0.1.0

Release Date: 10th Jan, 2024

We have released the first version of the TrustLLM assessment tool, which includes all the evaluation methods from our initial research paper.