The NeurIPS 2024 Preshow: NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples