The Literature Review Process (Part I)

Background

Upon starting graduate school, you’re faced with a deceptively hard question: “What should I choose for a research topic?” The possibilities are endless; that’s the first problem. When evaluating an idea there is also the question: “Is this a ‘worthy’ problem for a dissertation?”

The unfortunate truth is that identifying a good research problem is a skill you develop by doing research. I sympathize with the frustration in that irony; I experienced it myself in graduate school.

A solution I’ve found is to make the identification of a research topic systematic. It starts by defining a fairly broad area of interest. A “sieve” is then implemented on the related literature to classify research domains; another layer provides a restriction to a single domain; another identifies active research and open problems within the domain. This last layer automatically highlights open and “worthy” problems, by virtue of the activity itself. A rank-ordering is then applied, from which I’ll help you select a specific research problem to focus on.

I refer to the process above as the first-semester literature review.

You can think of it as a machine for translating a general interest (e.g., “grid energy management”) into a tangible primary research question (e.g., “How can stochasticity be exploited in [specific system] to achieve optimal stochastic control?”) and secondary questions (e.g., “What are the sources of stochasticity in this type of system? What are their statistical properties?”).

The process also introduces practices that will help you organize your research and conduct it more efficiently (see below). Overall, the literature review serves three purposes:

  1. It helps narrow your focus to a specific problem domain, and uses the state-of-the-art and existing gaps within it to suggest tangible research problems;

  2. In the process, you’ll construct a bibliographic reference file (.bib), which will be used (and built upon) for references in papers and your dissertation;

  3. The literature review narrative will be used in the introduction section of papers and eventually the second chapter of your dissertation.

All of these are meant to make your progression through graduate school more efficient, while teaching you techniques you can use throughout your career.

Identifying a Problem Domain

The first step is to take a broad survey of research in your area of interest. The basic question is: “What research is being done in this area?”

To answer this question, I recommend the following tools and search techniques.

IEEE Xplore. This archive is accessible on most university campuses and contains over 6 million papers on a wide variety of specialized technical topics. Start by searching on your overall topic of interest (in quotes if it’s a phrase). Then, use the “sort” feature to order the papers by citation count. This typically surfaces the highest-quality papers and survey articles, which make an excellent starting point. After you select a paper, you can select “Cite This” and choose BibTeX to download the .bib entry. Don’t forget to also download the .pdf file.

Research Rabbit. This application allows you to perform searches outside of (and including) IEEE, and can give a broader view. You can show a cluster-map of authors and topics; this can sometimes indicate sub-groups of activities (i.e., development stages) in the prior research. Search results include links to the location of the sources; however, not all of them will be freely accessible. You can cull a group of papers associated with a search and then export a .bib file; however, be careful to check the exported file for repeated name entries (an apparent bug as of the time of this writing).
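Checking an exported .bib file by eye gets tedious as it grows, so here is a minimal Python sketch of an automated check. It rests on assumptions: it reads “repeated name entries” as an author name that appears more than once in a single author field, and it also flags duplicate citation keys. The function names and the sample entries are hypothetical, and the regular expressions only handle simple, one-line author fields.

```python
import re
from collections import Counter

def split_entries(bib_text):
    """Return (citation_key, body) pairs for each @type{key, ...} entry."""
    pattern = re.compile(r"@\w+\s*\{\s*([^,\s]+)\s*,")
    matches = list(pattern.finditer(bib_text))
    entries = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(bib_text)
        entries.append((m.group(1), bib_text[m.end():end]))
    return entries

def repeated_authors(body):
    """Return author names listed more than once in a one-line author field."""
    m = re.search(r"^\s*author\s*=\s*[{\"](.+?)[}\"]\s*,?\s*$",
                  body, re.IGNORECASE | re.MULTILINE)
    if not m:
        return []
    # BibTeX separates authors with the word "and"; normalize spacing and case.
    names = [" ".join(n.split()).lower() for n in re.split(r"\s+and\s+", m.group(1))]
    return sorted(name for name, count in Counter(names).items() if count > 1)

def check_bib(bib_text):
    """Return (duplicate citation keys, {key: repeated author names})."""
    entries = split_entries(bib_text)
    key_counts = Counter(key for key, _ in entries)
    duplicate_keys = sorted(k for k, c in key_counts.items() if c > 1)
    author_issues = {}
    for key, body in entries:
        repeats = repeated_authors(body)
        if repeats:
            author_issues[key] = repeats
    return duplicate_keys, author_issues

# Hypothetical sample entries, for illustration only.
sample = """
@article{smith2020,
  author = {Smith, J. and Smith, J. and Doe, A.},
  title = {A Hypothetical Paper},
  year = {2020}
}
@article{lee2019,
  author = {Lee, K. and Doe, A.},
  title = {Another Hypothetical Paper},
  year = {2019}
}
@article{smith2020,
  author = {Smith, J. and Doe, A.},
  title = {A Hypothetical Paper},
  year = {2020}
}
"""

duplicate_keys, author_issues = check_bib(sample)
print("Duplicate keys:", duplicate_keys)
print("Repeated authors:", author_issues)
```

To use it on a real export, read the file into a string and pass it to check_bib. A dedicated BibTeX parser would be more robust for entries with nested braces or multi-line fields.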

This high-level search will reveal how active the area is. A low level of activity for a student-defined area of interest is quite rare; let me know if this occurs.

For the majority of students, there will be active research—typically too much. In this case, the next step is to start identifying research domains; this is where survey papers and cluster-mapping in Research Rabbit will come in handy. It may require skimming abstracts and possibly complete papers.

After you’ve identified domains, determine if any of them pique your interest. If you have a clear favorite and want to investigate it further, continue to the next step. If several domains interest you, and you want my input to help you decide, let me know. If nothing piques your interest, you’ll have to consider a new or modified area of interest.

Example: Suppose your general area of interest is “grid energy management.” You find two broad groupings: deterministic methods and stochastic methods. Within the stochastic methods you identify: stochastic optimization and approximate dynamic programming. Suppose you also want to include battery storage and solar PV systems in your research. You might select as your domain: “stochastic optimization for energy management in grids with battery energy storage and solar PV.” Your domain search key word list might be:

[“stochastic optimization” “energy management” “grid” “storage” “PV”]
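If you want to reuse the same list across several search tools, it can help to keep it in one place and generate the query string from it. The following is a minimal Python sketch; the function name is hypothetical, and combining terms with an uppercase AND is an assumption about the search syntax, so check the help page of whichever tool you use.

```python
def build_query(keywords, operator="AND"):
    """Join keywords into one Boolean search string, quoting each term."""
    # Strip stray whitespace and any straight or curly quotes already present.
    cleaned = [kw.strip().strip('"“”') for kw in keywords]
    quoted = [f'"{kw}"' for kw in cleaned if kw]
    return f" {operator} ".join(quoted)

# The domain keyword list from the example above.
domain_keywords = ["stochastic optimization", "energy management", "grid", "storage", "PV"]

query = build_query(domain_keywords)
print(query)
```

Requiring every term at once may return very few papers. If so, drop the most restrictive terms one at a time and compare the result counts.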

First-level Review | Total Paper Count

At this point, you have a target domain and associated key word list. Use the tools in the previous section to perform a refined search.

The next step is to perform what I’ll call the first-level review for the set of papers you found. Let’s define two collections: “high-quality” and “low-quality.” These collections will distinguish between papers you want to review thoroughly and will likely cite in future papers (high-quality) and those you’ll want to include in your bibliographic reference file, may need to cite, but won’t review thoroughly, at least not now (low-quality).

For each paper, do a high-level skim. Does it appear well-written, well-structured and thorough? Add it to your high-quality collection. Does it look only partially relevant, not as well-written, or have some other issue? Add it to the low-quality collection. If it doesn’t meet either of these, eliminate it from consideration.
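One way to keep the skim consistent from paper to paper is to record your answers and let a simple rule assign the collection. The Python sketch below is only an illustration: the field names, the may_need_to_cite flag, and the exact decision rule are my assumptions about how to encode what remains a subjective judgment, and the paper titles are placeholders.

```python
from dataclasses import dataclass

@dataclass
class Skim:
    """Notes from a high-level skim of one paper (every field is a judgment call)."""
    title: str
    well_written: bool
    well_structured: bool
    thorough: bool
    may_need_to_cite: bool  # hypothetical flag: still worth keeping in the .bib file

def triage(skim):
    """Assign a paper to a collection based on the skim notes."""
    if skim.well_written and skim.well_structured and skim.thorough:
        return "high-quality"
    if skim.may_need_to_cite:
        return "low-quality"
    return "eliminated"

def sort_into_collections(skims):
    """Group paper titles by the collection that triage() assigns."""
    collections = {"high-quality": [], "low-quality": [], "eliminated": []}
    for skim in skims:
        collections[triage(skim)].append(skim.title)
    return collections

# Placeholder papers, for illustration only.
skims = [
    Skim("Paper A (placeholder)", True, True, True, True),
    Skim("Paper B (placeholder)", True, False, True, True),
    Skim("Paper C (placeholder)", False, False, False, False),
]

collections = sort_into_collections(skims)
for name, titles in collections.items():
    print(f"{name}: {titles}")
```

The count of titles in the high-quality collection is the number to track against your target for the review.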

Yes, there is subjectivity in the above classification; but you’ll find that many papers in the literature are just…bad papers. These papers should be ignored because they never should have been published [separate topic for another day].

How many papers should you include in your literature review, i.e., the number of papers assigned to the high-quality collection? I can’t give an exact number because it depends on the maturity of the topic. But 40 is a reasonable total. A subset of these (say 3-5) will be the really “important” ones with regard to your research. If this sounds like a lot, remember that you’ll dedicate an entire semester to it.

I recommend a pacing strategy of completing 2 paper reviews per week to stay on track over the semester.