Flying an aircraft from takeoff to landing typically involves multiple tasks that take place in a large environment. Different regions of this environment affect the behavior of the aircraft differently, for example when flying at different altitudes and speeds. The action space is continuous, and the state is only partially observable because of, for example, atmospheric changes caused by humidity and the limited equipment onboard. Beyond typical flying tasks, accident prevention and recovery are essential for safety. Together, these flying conditions form a diverse task space and a complex challenge that can be approached with deep reinforcement learning in simulation. Existing research that uses reinforcement learning to fly aircraft or Unmanned Aerial Vehicles usually focuses on a subset of this task space. We propose a method based on on-policy reinforcement learning that covers a majority of flying tasks and potential accidents by leveraging the aerodynamics the agent learns during simple tasks. In creating such generally capable agents, safety is of higher importance for this problem than for other applications: when the agent cannot carry out a task, a feasible flight path should still be found. We train an agent in an on-policy framework on basic flying tasks, with the main focus on learning the dynamics of the environment. We show that this agent achieves high performance both zero-shot and after fine-tuning on a diverse set of failures. Furthermore, we propose a system for estimating the probability that the agent is within a given time frame of entering an unsafe state.