RAMA - REPOSITORY

REPOSITORY INFO

TITLE : Institut Teknologi Sepuluh Nopember Repository

URL : http://repository.its.ac.id/

SOFTWARE PLATFORM : E-Prints

TOTAL DOCUMENT : 30003
          
Implementasi Deep Reinforcement Learning pada Hexagonal Grid Turn-Based Strategy Game

Total View This Week : 0
          
                  Institution
                  
                  Institut Teknologi Sepuluh Nopember                  

          Author
          
          Asqav, Dafa Fidini
          
                Subject
                
                GV1469.2 Computer games 
                
                Datestamp
                
                2023-07-03 02:28:25 
                
                Abstract :

Strategy games are games in which the player makes strategic decisions to achieve an objective. One such game is Civilization VI (Civ6). In this game, players take turns acting against their opponents on a map made of hexagonal tiles (a hexagonal grid). Civ6 is a 4X game (exploration, exploitation, expansion, extermination), and these 4X aspects make it complex. This complexity makes it challenging for game developers to build AI opponents that give players a sufficient challenge, and the AI agents in strategy games such as Civ6 still perform below optimal. Advances in Deep Reinforcement Learning (DRL) offer AI capabilities that were not possible before. In this research, an environment that follows the combat mechanics of Civ6 is designed as a medium for implementing DRL. The environment contains two agents, an attacker and a defender, whose objectives differ (asymmetrical) and oppose each other (adversarial). Four state-of-the-art (SOTA) algorithms are used in the experiments: Deep Q-Learning (DQN), Distributed Prioritized Experience Replay Deep Q-Networks (APE-X DQN), Proximal Policy Optimization (PPO), and Importance Weighted Actor-Learner Architecture (IMPALA). In the experiments, APE-X DQN performed best for both the attacker agent and the defender agent. The APE-X DQN attacker agent was able to destroy the city consistently before 2 million environment steps during training. APE-X DQN was also the best-performing algorithm during inter-generation evaluation. However, some APE-X DQN generations could not be evaluated in the second evaluation stage (the 16x16 environment scenario). APE-X DQN used more CPU and RAM than the other algorithms, with 79.95% CPU utilization and 82.3% RAM utilization.

                                File :

                07211840000035_Undergraduate_Thesis.pdf
                
