Coding Challenge: Simple AI for Tic-Tac-Toe Game

This example relies heavily on the content of an excellent two part course made by Coursera and Rice University named “Principles of Computing with Python” that I took a few years ago. Except, this time I have used JavaScript and the solution has been built from ground-up: without provided classes or any kind of help. Additionally, I have experimented with a few variants of AI in order to check the effectiveness of proposed solution.

The main problem here is to make the computer player choose rational moves in a game of Tic Tac Toe. There are many solutions available on the internet (i. e. http://neverstopbuilding.com/minimax), but the one I have chosen was a simple, yet pretty effective implementation of Monte Carlo method.

Since this was a simple exercise developed in short time, there are many things that could be perfected, debugged, refactored, etc. Also, I’ve only allowed a Human player to play first. But, I wasn’t developing a game, rather I was exploring the effectiveness of proposed decision-making algorithm, and in order to do that I needed relatively decent user interface.

Basic technical stuff

First of all, the board is not represented as two dimensional, but rather one dimensional array. The reason is simple: single dimensional arrays are easier to deal with and CSS enabled me to display them as a 3×3 square matrix indexed from 0 to 8:

table

Then, eight winning patterns were defined (three rows + three columns + two diagonals): 0-3-6, 1-4-7, 2-5-8, 0-1-2, 3-4-5, 6-7-8, 0-4-8 and finally 6-4-2. As I have said, human player always starts first, and his moves are represented as H, while computer player is C. This makes the game sound more Hard Core than X vs O 😉

The AI Algorithm

I have implemented three variants of computer player’s decision making algorithm. First, there is a Naive Random mode, which just serves the purpose of showing how ineffective or stupid AI looks like. Its decisions are based on random selection of one of the available moves. And as expected, it sure is stupid – as you can see from the example below.

naive_ai
Naive Random Mode

Next, there is the main driver behind AI, the so called Smart Mode. This mode is based on what could be loosely described as Monte Carlo Method. In this mode, computer player is given a board state. He then saves this state and continues to play from that point on making random moves for human and computer player until the game is over.

In other words, the computer simulates random outcome of the game for a given board state. When the game is over, the result is taken into account and if it is a win the winner’s moves are scored with +1 weight factor, while the loser’s moves are scored with -1 weight factor. In case of a tie, the game is simply ignored. So the point is that AI now has the information about the most useful moves – the ones that are most strongly associated with winning outcome – so that it can chose the best one in a given context. These moves are kept in a separate table representation which keeps scores for every field of the board.

With this mode of decision making, the computer player is significantly better and smarter. That is to say, the AI uses opportunities for win well and it blocks the opponent relatively consistently (the algorithm is not ‘fatalistic’ as in the link provided above and I would say it is less computationally intensive). Although, it still makes obvious mistakes from time to time. I have discovered during the tests that Smart Mode is particularly vulnerable to ‘second column’ or ‘second row’ win patterns by human player, as you can see from the picture below.

Smart AI
Smart Random Mode

Since Smart Mode wasn’t smart enough, I tried to improve its effectiveness. This is what I call the Adaptive Mode, or simply put – this is the mode in which real human player win moves are scored as well as simulated ones. I thought that it would be interesting to try to improve/correct simulated moves’ scores with real intelligent user moves. That’s why I gave human moves more weight in scoring. The goal is to provide the AI an opportunity to learn from previous mistakes. The results are promising. Namely, since you should not give too much weight to human moves in order to avoid seriously biasing the results of Monte Carlo simulation, Adaptive Mode shows its advantages over time.

As I have noticed, Smart Mode is fairly consistent in terms of making moves – it does not variate much. On the contrary, Adaptive Mode slowly corrects itself and goes for optimal moves gradually. In the example below you can see how Adaptive move is first exactly the same as Smart move, since human moves are not accounted for. But over time, as it gains experience, it corrects its playing strategy and blocks human from winning the same way as before.

The thing here is to experiment with weight factors in order to get the optimal AI behavior.

adaptive_random_gameplay
Adaptive Mode

For those interested, you can pick up the source code (html/css/js) on my github at https://github.com/zkucekovic/Monte-Carlo-Tic-Tac-Toe. As always, your feedback is more than welcomed!

 

Graphing Political Proximity

Among the topics I am highly interested in are network representations of political and social structures. Examples below illustrate some of my recent work in this field. New content should be added occasionally. Please feel free to share your thoughts and impressions!

1. Proximity indicators in public political statements

screenshot
Political statements transformed into a graph

Project description

The purpose of this mini-project was to compile the dataset from local media sources, analyze the content of scraped news articles and code the statements of various political parties’ representatives about the possibility of mutual cooperation. Statements were classified into three categories: positive, negative and neutral. Time span of the study was two months.

Data was transformed and converted into network using NetworkX Python module. Finally, the graph was stylized using Gephi.

Numerical representations of proximity (edge weights) used to build the graph are:

ldk {'vv': -1, 'pdk': -11, 'aak': -3, 'nisma': -1, 'akr': 2, 'alternativa': 2}

akr {'pdk': -2, 'aak': -2, 'nisma': -2, 'vv': 2, 'alternativa': 1}

vv {'pdk': -2, 'ldk': 1, 'akr': 1, 'sl': -1, 'nisma': 1}

aak {'pdk': 0, 'nisma': 2, 'ldk': -1, 'vv': 1, 'sl': -1}

alternativa {'pdk': -2, 'aak': -1, 'nisma': -1}

pdk {'ldk': -3, 'nisma': 4, 'aak': 3, 'vv': -1}

nisma {'aak': 3, 'pdk': 0}

sl {'pdk': 1, 'ldk': 1, 'sl': -2}

The final graph depicts the structure of political relations based on the proximity of parties. Darker edges mean higher weights, while the numerical label of edge weight is colored from red (lower weight) to green (higher weight). Size of nodes represents outdegree (the bigger the size, the more statements the party gave) and the color of node represents indegree (the darker the color, the more statements are made about the party).

Results interpretation

Available data suggests that the political scene in Pristina is notably divided into two blocks which generally corresponds with pre-electoral coalitions. In case PDK-AAK-NISMA coalition fails to form a government, the graph suggests that there is a possibility of alignment among Vetëvendosje and some of the parties from LDK-AKR-Alternativa coalition (especially AKR), since they expressed more positive statements about each other and their proximity index is thus higher. Data also shows that Serbian List (SL) is facing cohesion problems, since its leaders issued negative statements about each other. Of course, eventual predictions of post-electoral dynamics based only on these indicators are not so useful, since public statements should not be interpreted as reliable clue of real political intentions. Contrary to that, they are just general contextual features of the current political setting.

2. Proximity indicators in voting patterns

parliament.png
Social Network Model of the Kosovo Assembly (2014-2016)
parliamentary_group_cohesion_example
Investigating proximity within a parliamentary group

Project description

Considering that cohesion is one of the most important measures when it comes to the exploration of intraparty and coalition dynamics, there is a need to model political proximity relations as accurately as possible. The graph shown above is based on “roll-call votes” (RCV), e.g. votes of MPs recorded by electronic system, which were later published on the website of the Kosovo Assembly. Even though these recordings comprise the largest number of voting activities of Kosovo MPs, we should notice that the present electronic system does not include all voting procedures of the Kosovo Assembly, primarily because MPs sometimes use voting methods other than the electronic system (secret ballot for example). However, these situations are not so often in the Kosovo Assembly, which makes RCV a reliable source for conducting research on Kosovo MPs voting behaviour. The time span of the research is December 2014 to December 2016, e. g. the first two years of Isa Mustafa’s Government.

Sampled RCV records included five types of different data on every bill or motion voted upon in the Kosovo Assembly: date of voting procedure, name of MP, her/his party affiliation, weather she/he was present at the voting, and her/his vote in three categories (“yes”, “no”, “abstain”).  This data was later transformed into “similarity matrix” which was used as a basis for the graph creation (again, using NetworkX module and Gephi).

Interpretation

In summary, what can be concluded is that voting behavior of Kosovo MPs reflected political dynamics of their parties, which could be seen in many examples of major political events. Also, two measures of cohesion (a kind of political proximity metric) of the parliamentary groups have been used in the research and one of them (voting agreement index) outperformed the other (social network modularity measure). Including datetime codes into the dataset enabled tracking of the changes in parliamentary groups cohesion, which most of the time faithfully reflected the actual political dynamics. Consequentially, the above graph generally corresponded well with the actual structure of the Parliament at the time and the divisions between the opposition (the part of the graph above the red line) and the ruling parties (below the line) are clearly visible at the highest level of aggregation.

Notice that this research is submitted for publication.