Patronus AI

company

Verified

https://patronus.ai

AI & ML interests

LLM Evaluation

Recent Activity

DarshanDeshpande published a dataset 3 days ago

PatronusAI/world_model_corpus

DarshanDeshpande updated a dataset 3 days ago

PatronusAI/world_model_corpus

akkikiki authored a paper about 1 month ago

Diable: Efficient Dialogue State Tracking as Operations on Tables

Papers

Benchmarking Reward Hack Detection in Code Environments via Contrastive Analysis

MEMTRACK: Evaluating Long-Term Memory and State Tracking in Multi-Platform Dynamic Agent Environments

PatronusAI's datasets (39)

PatronusAI/openai-gpt-4-turbo-covidqa-generations

Viewer • Updated Jul 9, 2024 • 1k • 26

PatronusAI/lynx-70b-instruct-ragtruth-generations

Viewer • Updated Jul 8, 2024 • 900 • 12

PatronusAI/lynx-70b-instruct-pubmedqa-generations

Viewer • Updated Jul 8, 2024 • 1k • 12

PatronusAI/lynx-70b-instruct-halueval-generations

Viewer • Updated Jul 8, 2024 • 10k • 11

PatronusAI/lynx-70b-instruct-financebench-generations

Viewer • Updated Jul 8, 2024 • 1k • 66

PatronusAI/lynx-70b-instruct-drop-generations

Viewer • Updated Jul 8, 2024 • 1k • 11

PatronusAI/lynx-70b-instruct-covidqa-generations

Viewer • Updated Jul 8, 2024 • 1k • 11

PatronusAI/drop-test

Viewer • Updated Jun 17, 2024 • 1k • 14

PatronusAI/financebench-test

Viewer • Updated Jun 17, 2024 • 1k • 79