AI World Cup (2019)
Deep Q-Network for AI Soccer, Curie Kim, Yewon Hwang, Jong-Hwan Kim
Accepted to The 10th International Conference on Robot Intelligence Technology and Application (RiTA 2022).
- Ranked in Top 16 at the WCG 2019 Xi’an AI Masters.
- [Arxiv]
Competition Video
Our team is CY95!
Simulation Environment
We used the AI Soccer environment proposed by Hong, C.1 (available at here) for the experiment. AI Soccer is a 5:5 robot soccer game where each team is composed of one goalkeeper, two defenders, and two forwards. However, the role does not affect the robots’ capabilities: the only difference between each player is its specifications and initial position. The game is comprised of 5 minutes first half and 5 minutes second half. At the beginning of each half and after a team makes a goal, kick-off happens where only the second forward player of the ball owner’s team can move.
Our strategy
For state space, we define 22 states to train the agents. The robots’ coordinate and orientation values as well as the ball’s positions and orientations were provided through Webots throughout the game. All the positions were provided in meters in a Cartesian coordinate system, and the orientations were provided in radian. We first defined the x, y, and $\theta$ values of each player excluding the goalkeeper, and applied a regularization process to each value to avoid the risk of overfitting. Since there are two defense players and two forward players, 12 states are defined thus far. We also provided boolean information that indicates whether a player is active or inactive, in case a player is dismissed due to varying reasons for both forward and defense players as a state, which makes 16 states in total thus far. In addition, we provided 4 more states which are two x and y values of the ball that have undergone the regularization process. The reason we gave each x and y value twice is to give more weight to the values because we consider that information about the ball was the most critical information when training the agents. Lastly, we also provided x and y values of the ball after two frames from the current frame that have undergone the regularization process. The motivation behind the last two states is to train the agents so that they can predict the position of the ball in the next two frames and act accordingly in favor of our team. Therefore, there are 22 states in total as follows:
\[s = \{ 4_{(F1,F2, D1, D2)} \times (player\_x, player\_y, player\_\theta, is\_active), \\ 2 \times (ball\_x, ball\_y), (predicted\_ball\_x, predicted\_ball\_y) \}.\]-
Hong, C., Jeong, I., Vecchietti, L.F., Har, D., Kim, J.H.: Ai world cup: Robot-soccer-based competitions. IEEE Transactions on Games 13(4) (2021) 330–341 ↩