RL_Interview_Qs Flashcards
(2.01) What is Reinforcement Learning?
(2.02) Can you explain the key components of a Reinforcement Learning problem: agent, environment, state, action, and reward?
(2.03) What is the difference between supervised learning, unsupervised learning, and reinforcement learning?
(2.04) What are the two main types of reinforcement learning algorithms: model-based and model-free?
(2.05) What is the Markov Decision Process (MDP) and how is it related to Reinforcement Learning?
(2.06) Can you explain the concepts of exploration and exploitation in the context of RL?
(2.07) What is the difference between policy-based and value-based reinforcement learning methods?
(2.08) What is Q-Learning? How does it work?
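A possible answer sketch: tabular Q-learning on a hypothetical toy chain MDP (illustrative only, not from any library) — states 0..4, actions left/right, reward 1 for reaching the rightmost state.

```python
import random

N_STATES = 5
ACTIONS = (0, 1)  # 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

def step(state, action):
    """Deterministic transition; episode ends at the rightmost state."""
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward, next_state == N_STATES - 1

def greedy(Q, s):
    """Greedy action with random tie-breaking."""
    best = max(Q[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(s, a)] == best])

random.seed(0)
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
for _ in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy behaviour policy: explore with probability EPSILON
        a = random.choice(ACTIONS) if random.random() < EPSILON else greedy(Q, s)
        s2, r, done = step(s, a)
        # off-policy update: bootstrap from the best next action, not the one taken
        td_target = r + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (td_target - Q[(s, a)])
        s = s2

# The learned greedy policy moves right from every non-terminal state.
policy = {s: greedy(Q, s) for s in range(N_STATES - 1)}
```

The key interview point is the update rule: the target bootstraps from the max-valued next action, which makes Q-learning off-policy.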
(2.09) Can you describe the concept of discount factor (gamma) in RL and its purpose?
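To make the discount factor concrete: the return is G = r_0 + γ·r_1 + γ²·r_2 + …, so γ near 0 makes the agent myopic and γ near 1 makes it far-sighted. A minimal helper (hypothetical, for illustration):

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t, accumulated backwards for simplicity."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Three rewards of 1 with gamma = 0.9: 1 + 0.9 + 0.81 ≈ 2.71
print(discounted_return([1.0, 1.0, 1.0], 0.9))
```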
(2.10) What is the Bellman Equation? How is it used in reinforcement learning?
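For reference, one standard form of the Bellman equations (notation assumed: transition probabilities P, reward R, discount factor γ):

```latex
% Bellman expectation equation for the state-value function under policy \pi
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)\,
             \bigl[ R(s, a, s') + \gamma \, V^{\pi}(s') \bigr]

% Bellman optimality equation (the fixed point targeted by value iteration and Q-learning)
V^{*}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)\,
           \bigl[ R(s, a, s') + \gamma \, V^{*}(s') \bigr]
```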
(2.11) What is the purpose of an epsilon-greedy strategy in RL?
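As a concrete illustration, epsilon-greedy action selection fits in a few lines (a hypothetical helper over tabular action-value estimates, not any particular library's API):

```python
import random

def epsilon_greedy(q_values, epsilon):
    """Explore (uniform random action) with probability epsilon,
    otherwise exploit the current greedy action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    # exploit: index of the largest action-value estimate
    return max(range(len(q_values)), key=lambda a: q_values[a])

# epsilon = 0 is purely greedy; epsilon = 1 is purely random.
print(epsilon_greedy([0.1, 0.5, 0.2], epsilon=0.0))  # always picks action 1
```

In practice epsilon is often decayed over training so the agent explores early and exploits later.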
(2.12) Can you briefly explain the Temporal Difference (TD) learning method?
(2.13) What are the main challenges of reinforcement learning?
(2.14) What are some popular applications of reinforcement learning in real-world scenarios?
(2.15) What is the role of the reward function in RL? Can you give an example?
(2.16) What is the Monte Carlo method in RL and when is it used?
(2.17) Can you explain the concept of state-value function (V) and action-value function (Q)?
(2.18) What is SARSA? How does it differ from Q-Learning?
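The SARSA vs. Q-learning difference comes down to the bootstrap target; a side-by-side sketch (illustrative helper functions, assumed names):

```python
GAMMA = 0.9  # assumed discount factor for the example

def sarsa_target(r, q_next, a_next):
    # SARSA (on-policy): bootstrap from the next action actually taken
    return r + GAMMA * q_next[a_next]

def q_learning_target(r, q_next):
    # Q-learning (off-policy): bootstrap from the greedy (max-valued) next action
    return r + GAMMA * max(q_next)

q_next = [0.2, 0.8]  # hypothetical Q(s', .) estimates
print(sarsa_target(1.0, q_next, a_next=0))   # 1 + 0.9 * 0.2 ≈ 1.18
print(q_learning_target(1.0, q_next))        # 1 + 0.9 * 0.8 ≈ 1.72
```

When the behaviour policy explores (here, action 0 was taken even though action 1 looks better), the two targets diverge, which is why SARSA learns the value of the policy it actually follows.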
(2.19) Can you provide an example of a continuous action space in RL?
(2.20) What is deep reinforcement learning? How does it combine deep learning and reinforcement learning?
(3.01) What are some common techniques to address the exploration-exploitation dilemma in RL?
(3.02) How does the concept of “credit assignment” apply to RL?
(3.03) What are the key differences between on-policy and off-policy learning in RL?
(3.04) Can you explain the concept of the “curse of dimensionality” in the context of RL and how it affects learning?