Design and implementation of three different agents

Reference no: EM132248724, Length: 1000 words

Artificial Intelligence - Assessed Exercise

Problem Statement -

Your task is to design, implement, evaluate and document three virtual agents which are (potentially) able to reach a goal in a custom OpenAI Gym environment derived from FrozenLake. Thus, you will need to install and understand the workings of the OpenAI Gym environment to be able to solve the task (hint: see AI (H) Lab 2).

The specific environment/problem under consideration is a grid-world with a starting position (S), obstacles (H) and a final goal (G). The task is to get from S to G. The environment is defined in uofgsocsai.py via the class LochLomondEnv including documentation relating to the setting, specific states, parameters etc. An example of how to instantiate the environment and navigate it using random actions is provided in lochlomond_demo.py.

You must consider three agent types: a senseless/random agent, a simple agent and a reinforcement learning agent, based on the requirements listed in the following sections. Your agents will be tested against other instances of the same problem type, i.e., you cannot (successfully) hard-code the solution. You will have access to eight specific training instances of the environment, determined by the value of a single variable, problem_id.

Your agents and findings should be documented in a short (max 1000 word) technical report accompanied by the actual implementation/code and evaluation scripts.

Tasks -

You should provide a design, implementation, evaluation and documentation of three different agents; each with its own set of specific requirements:

Task I: Senseless/Random agent

You should provide a solution for an agent without sensory input which takes random actions. You should initialise the environment as follows: env = LochLomondEnv(problem_id=problem_id, is_stochastic=True, reward_hole=0.0), where you can use problem_id ∈ [0:7] to evaluate the performance over different instances of the same problem.

Purpose: This agent should be used as a naive baseline. Hint: A basic senseless/random agent is already partly provided in lochlomond_demo.py (albeit without computation of the performance measure...).

Requirements:

  • Sensors: None (/random/full; it doesn't matter...)
  • Action: Discrete
  • State-space: No prior knowledge (i.e. it has not got a map)
  • Rewards/goal: No prior knowledge (does not know where the goal is located)
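
A minimal sketch of such a baseline is given here, assuming the environment follows the classic Gym convention in which reset() returns a state and step(action) returns (state, reward, done, info). TinyCorridorEnv and run_random_agent are hypothetical names used only for illustration; the stand-in class is not LochLomondEnv, and the success rate is just one possible performance measure.

```python
import random


class TinyCorridorEnv:
    """Hypothetical stand-in for LochLomondEnv: a 1-D corridor of 4 cells.

    Assumes the classic Gym convention: reset() -> state,
    step(action) -> (state, reward, done, info).
    """

    n_actions = 2  # 0 = left, 1 = right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(3, max(0, self.state + move))
        done = self.state == 3  # cell 3 is the goal
        return self.state, (1.0 if done else 0.0), done, {}


def run_random_agent(env, n_actions, episodes=100, max_steps=50, seed=0):
    """Return the total reward of each episode for an agent that ignores its observations."""
    rng = random.Random(seed)
    returns = []
    for _ in range(episodes):
        env.reset()
        total = 0.0
        for _ in range(max_steps):
            # The action is drawn uniformly at random; the observation is discarded.
            _, reward, done, _ = env.step(rng.randrange(n_actions))
            total += reward
            if done:
                break
        returns.append(total)
    return returns


returns = run_random_agent(TinyCorridorEnv(), TinyCorridorEnv.n_actions)
success_rate = sum(r > 0 for r in returns) / len(returns)
```

With the real environment, the same loop would be run once per problem_id so that the baseline can be compared with the other agents on identical instances.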

Task II: Simple Agent -- A* Search

You should provide an agent based on a tree/graph-search and justify its use for solving the particular problem (using A* search with a suitable heuristic), assuming that the task environment is fully known and observable. You should initialise the environment as follows: env = LochLomondEnv(problem_id=[0-7], is_stochastic=False, reward_hole=0.0), where you can use problem_id ∈ [0:7] to evaluate the performance over different instances of the same problem, and to fine-tune your agent to make sure it generalises. We recommend you use existing code (e.g. from the AIMA toolbox) to solve this part.

Purpose: This agent is used as an ideal baseline to find the optimal path under ideal circumstances. Hint: if you have attended the Lab sessions you will easily be able to reuse most of the code to solve this part (a parser that maps from env.desc to the lab 3 format will be made available).

Requirements:

  • Sensors: Oracle (i.e. you're allowed to read the location of all objects in the environment, e.g. using env.desc)
  • Actions: Discrete and noise-free.
  • State-space: Fully observable a priori.
  • Rewards/goal: Fully known a priori (you are allowed to inform the problem with the rewards and location of terminal states)
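
As an illustration only, A* on a small grid can be sketched as follows. The sketch assumes the map is already available as a list of strings using the letters S, H and G from the problem statement (plus F for free cells, an assumption borrowed from the usual grid-world notation); the real env.desc may need converting first. The function name astar and the toy map are hypothetical, and Manhattan distance is used as one admissible heuristic for four-directional moves of unit cost.

```python
import heapq


def astar(desc):
    """Return a shortest list of (row, col) cells from S to G, or None if G is unreachable."""
    cells = {(r, c): ch for r, row in enumerate(desc) for c, ch in enumerate(row)}
    start = next(p for p, ch in cells.items() if ch == "S")
    goal = next(p for p, ch in cells.items() if ch == "G")

    def h(p):
        # Manhattan distance never overestimates the cost of unit-cost 4-way moves.
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, current = heapq.heappop(frontier)
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > cost[current]:
            continue  # stale queue entry; a cheaper route was found already
        for dr, dc in ((0, -1), (1, 0), (0, 1), (-1, 0)):
            nxt = (current[0] + dr, current[1] + dc)
            if cells.get(nxt, "H") == "H":
                continue  # holes and off-grid cells are not expanded
            new_cost = g + 1
            if new_cost < cost.get(nxt, float("inf")):
                cost[nxt] = new_cost
                came_from[nxt] = current
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt))
    return None


toy_map = ["SFFF",
           "FHFH",
           "FFFH",
           "HFFG"]
path = astar(toy_map)
```

The returned cell sequence still has to be translated into the environment's action indices before it can be executed; that mapping depends on uofgsocsai.py and is not shown here.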

Key Task III: Reinforcement learning agent -- Q-Learning

You should provide a reinforcement learning agent to solve the problem with minimal assumptions about the environment (see below). The choice of the RL agent is up to you, but we highly recommend using tabular Q-learning. Regardless, the choice should be well-justified and meet the requirements listed below. You should instantiate the environment as follows: env = LochLomondEnv(problem_id=[0-7], is_stochastic=True, reward_hole=[YOUR-CHOICE]), where you can use problem_id ∈ [0:7] to evaluate the performance over different instances of the problem, and to fine-tune your agent to make sure it generalises. You are encouraged to write your own code, but use of 3rd party implementations (e.g. the AIMA toolbox) is allowed if you demonstrate a sufficient understanding of the algorithms in the accompanying documentation and cite appropriately.

Requirements:

  • Sensors: Perfect information about the current state and thus available actions in that state; no prior knowledge about the state-space in general
  • Action: Discrete and noisy. The requested action is only carried out correctly with a certain probability (as defined by uofgsocsai.py).
  • State-space: No prior knowledge, but partially observable/learn-able via the sensors/actions.
  • Rewards: No prior knowledge, but partially observable via sensors/actions.
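
The recommended tabular Q-learning approach can be sketched roughly as below. This is an illustrative outline, not a reference solution: SlipperyCorridorEnv is a hypothetical stand-in with noisy actions (it is not LochLomondEnv), the names q_update and train are made up for this example, and the values of alpha, gamma and epsilon are arbitrary assumptions that would need tuning.

```python
import random


class SlipperyCorridorEnv:
    """Hypothetical stand-in for a stochastic grid world (not LochLomondEnv)."""

    n_states, n_actions = 4, 2  # actions: 0 = left, 1 = right

    def __init__(self, slip=0.2, seed=0):
        self.slip = slip
        self.rng = random.Random(seed)

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if self.rng.random() < self.slip:
            action = 1 - action  # the requested action is not carried out
        self.state = min(3, max(0, self.state + (1 if action == 1 else -1)))
        done = self.state == 3  # cell 3 is the goal
        return self.state, (1.0 if done else 0.0), done, {}


def q_update(q, s, a, r, s_next, done, alpha, gamma):
    """One tabular Q-learning step: move Q(s, a) towards the TD target."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def train(env, episodes=500, max_steps=100, alpha=0.1, gamma=0.9, epsilon=0.2, seed=1):
    """Learn a Q-table with epsilon-greedy exploration; return it with per-episode rewards."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    returns = []
    for _ in range(episodes):
        s, total = env.reset(), 0.0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(env.n_actions)  # explore
            else:
                a = max(range(env.n_actions), key=lambda i: q[s][i])  # exploit
            s_next, r, done, _ = env.step(a)
            q_update(q, s, a, r, s_next, done, alpha, gamma)
            s, total = s_next, total + r
            if done:
                break
        returns.append(total)
    return q, returns


q_table, episode_returns = train(SlipperyCorridorEnv())
```

The agent only ever sees the current state index and the reward returned by step, which matches the "no prior knowledge" requirements above; the Q-table is kept across episodes, so knowledge accumulates over restarts.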

Notice: You can restart and replay the same instance of the problem multiple times and maintain the knowledge obtained across repetitions/episodes. The reward should in this case be reported as:

a) the average across all restarts/repetitions and,

b) the single best policy (typically after a long learning phase).
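
One possible way to compute these two figures is sketched below. It assumes a list of per-episode rewards collected during learning and a learned Q-table stored as a list of rows; all names (mean_return, moving_average, evaluate_greedy, DeterministicCorridorEnv) are hypothetical, and reading (b) as "the reward of the final greedy policy" is an interpretation rather than something the brief spells out.

```python
class DeterministicCorridorEnv:
    """Hypothetical 3-cell stand-in environment; cell 2 is the goal."""

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action 1 moves right, any other action moves left
        self.state = min(2, max(0, self.state + (1 if action == 1 else -1)))
        done = self.state == 2
        return self.state, (1.0 if done else 0.0), done, {}


def mean_return(returns):
    """(a) Average reward across all restarts/episodes."""
    return sum(returns) / len(returns)


def moving_average(returns, window):
    """Mean of each consecutive `window`-episode slice, useful for learning curves."""
    return [sum(returns[i:i + window]) / window
            for i in range(len(returns) - window + 1)]


def evaluate_greedy(env, q, episodes=10, max_steps=20):
    """(b) Average reward when always taking the highest-valued action in q."""
    total = 0.0
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            a = max(range(len(q[s])), key=lambda i: q[s][i])
            s, r, done, _ = env.step(a)
            total += r
            if done:
                break
    return total / episodes


learned_q = [[0.0, 0.5], [0.0, 0.9], [0.0, 0.0]]  # made-up values for illustration
best_policy_reward = evaluate_greedy(DeterministicCorridorEnv(), learned_q)
```

Plotting moving_average against the episode index gives the kind of "performance vs number of episodes" curve that the evaluation is expected to contain.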


Word limit: 1000 words. Referencing style: footnotes and bibliography.

Submission - You should include the following three items in your submission: Implementation & Code - Your implementation, containing the three different agents (along with any dependencies, except the OpenAI Gym), should be uploaded to Moodle as a zip file containing the source code. Experiment & Evaluation - An important aspect of RL is assessing and comparing the performance of agents and different policies. To document the behaviour of your agents, you should design a suitable set of (computer) experiments which produce a relevant set of graphs/tables to document the behaviour of your RL agent (e.g. average performance measure vs number of episodes, etc.) and compare its performance against the baselines.

Short Report (1000 words) - You should document your work and results in a short technical report (aim for 1000 words, excluding figures, tables, captions, references and appendices). Appendices may be used to provide extra information to support the data and arguments in the main document, e.g., detailed simulation results, but should not provide crucial information required to understand the principle of the solution and the outcome. You can include as many references as you see fit. Requirements: The focus is on an efficient, yet accurate and precise, description of the implemented agents and learning algorithms in the given context of the task. We mark your understanding of these agents/learning algorithms and want to stress the importance of their evaluation. The report should be submitted via Moodle as a pdf file alongside your implementation and evaluation scripts.

Marking Scheme - The assessment is based on the degree to which your submission (implementation, evaluation script and report) concisely, correctly and completely addresses the following aspects: Analysis [15%] - Introduction/motivation and correct PEAS analysis (including task environment characterisation). Method/design [15%] - Presentation of relevant theory and methods (for all your agents) with proper use of citations and justification for choosing specific methods (referencing your analysis). Implementation [25%] - The code for all the agents (not in the report!) should be well-documented, follow best practices in software development and follow the outlined naming convention. The report must contain a presentation of relevant aspects of the implementation.

Experiments / Evaluation [20%] - Evaluation script/program to reproduce the results (i.e. graphs/tables) adhering to requirements. [20%] - Relevant presentation of the evaluation strategy, metrics and the obtained simulation results. A suitable presentation and comparison of the performance of the agent with other agents as evaluated across a suitable number of problem variations (e.g. using graphs/tables). Discussion and conclusion [5%] - Including a critical reflection on the outcome/results. The weighting of the senseless, simple and RL agents, for all marking criteria, is 5, 10 and 85%, respectively. So allocate your own time accordingly and remember that this exercise counts for 25% of your overall grade in AI, so it is very important.
