Changelog
⏰ TODO in Coming Versions
- [x] Faster and simpler evaluation pipeline
- [ ] Dynamic dataset
- [ ] More fine-grained datasets
- [ ] Chinese output evaluation
- [ ] Downstream application evaluation
Version 0.3.0
Release Date: 23rd Apr, 2024
- Support parallel retrieval of embeddings when evaluating AdvInstruction
- Add exception handling for partial evaluations
- Fix some bugs
- Add evaluation results for ChatGLM3, GLM-4, Mixtral, Llama3-8b, and Llama3-70b
Version 0.2.3 & 0.2.4
Release Date: March 2024
- Fix some bugs
- Support Gemini API
Version 0.2.2
Release Date: 1st Feb, 2024
- Support awareness evaluation in our new work
- Support Zhipu API evaluation (GLM-4 & GLM-3-turbo)
Version 0.2.1
Release Date: 26th Jan, 2024
- Support LLMs hosted on Replicate and DeepInfra
- Support an easy pipeline for evaluation
- Support Azure OpenAI API
Version 0.2.0
Release Date: 20th Jan, 2024
- Add generation section
- Support concurrency when using auto-evaluation
Version 0.1.0
Release Date: 10th Jan, 2024
We have released the first version of the TrustLLM assessment tool, which includes all the evaluation methods from our initial research paper.