Long-tail and rare-event problems become crucial when autonomous driving algorithms are deployed in the real world. To evaluate systems under such challenging settings, we propose a generative framework that creates safety-critical scenarios for evaluating specific task algorithms. We first represent traffic scenarios with a series of autoregressive building blocks and generate diverse scenarios by sampling from the joint distribution of these blocks. We then train the generative model as an agent (a generator) to search for risky scenario parameters for a given driving algorithm. The driving algorithm is treated as an environment that returns a high reward to the agent when a risky scenario is generated, and the whole process is optimized with a policy-gradient reinforcement learning method. Through experiments on several scenarios in simulation, we demonstrate that the proposed framework generates safety-critical scenarios more efficiently than grid search or human-designed methods. A further advantage of the method is its adaptiveness to different routes and parameters.
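The following is a minimal sketch of the policy-gradient loop described above, assuming a Gaussian policy over continuous scenario parameters and a stubbed `evaluate_scenario` function standing in for the driving algorithm under test; the parameter dimension, reward shape, and all names are illustrative assumptions, not the paper's code. In the actual framework the generator is an autoregressive model over scenario building blocks; the sketch only shows the reward-driven update.

```python
# Minimal REINFORCE-style sketch of the generator/agent loop.
# The scenario representation, reward shaping, and `evaluate_scenario`
# stub are illustrative assumptions, not the paper's implementation.
import torch

PARAM_DIM = 4  # e.g., spawn position, speed, trigger distance (assumed)

# Generator: a Gaussian policy over continuous scenario parameters.
mean = torch.zeros(PARAM_DIM, requires_grad=True)
log_std = torch.zeros(PARAM_DIM, requires_grad=True)
optimizer = torch.optim.Adam([mean, log_std], lr=1e-2)

def evaluate_scenario(params: torch.Tensor) -> float:
    """Stand-in for running the driving algorithm in simulation.

    Returns a high reward when the sampled scenario is risky for the
    algorithm under test (e.g., a near-collision). Here a fixed 'risky
    region' is faked purely so the sketch runs end to end.
    """
    risky_center = torch.tensor([1.5, -0.5, 2.0, 0.0])
    return float(torch.exp(-((params - risky_center) ** 2).sum()))

for step in range(500):
    dist = torch.distributions.Normal(mean, log_std.exp())
    params = dist.sample()              # sample one scenario
    reward = evaluate_scenario(params)  # driving algorithm as environment
    # Policy gradient: raise the log-probability of high-reward scenarios.
    loss = -dist.log_prob(params).sum() * reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("learned risky scenario parameters:", mean.detach())
```

Because the reward signal comes only from executing the driving algorithm, this loop treats it as a black box, which is what lets the framework adapt to different routes and parameters without access to the algorithm's internals.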