N-Day-Bench is a benchmark that evaluates how well large language models (LLMs) identify known vulnerabilities in real-world codebases. It provides a standardized, challenging environment for assessing LLM performance on security tasks. The goal is to measure what LLMs can currently do in vulnerability detection and to guide future research in this area.