Storing and archiving data


When I was doing my own PhD, I had a filing cabinet with three or four drawers, and even then I had hundreds of photocopies of academic papers stacked in small piles according to theme and relevance to the section that I was writing about next. My raw research data, however, was compactly contained in electronic format in the form of tables and graphs; row after row of numbers on spreadsheets which could be tabulated and correlated in any format that I desired. When I left the department, the files were archived for a few years, and then I suspect they were all dumped when the department moved to another building on another campus.

Now, when I generate research data, it is almost entirely in electronic format, and it is automatically stored in several places. I have my personal space in the memory banks of the university computing system, and this space is automatically backed up overnight. I also usually back up to my own cloud space, so that I can access the data wherever and whenever I want. Usually, I also store data for individual projects on a separate memory stick or portable hard drive. The digital age means that after two or three clicks, I can be assured that copies of my data are safely held in four or five independent locations. Research students can simultaneously share data with a colleague or supervisor in a different part of the world without even leaving their own desk.
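
Holding several copies only helps if the copies really are identical to the original. As a minimal sketch of how that might be checked, assuming Python 3.9 or later is available, the snippet below compares the SHA-256 checksum of an original file with each of its copies. The function names `sha256_of` and `check_copies` are illustrative, not part of any particular backup tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_copies(original: Path, copies: list[Path]) -> dict[str, str]:
    """Compare each copy with the original.

    Returns a mapping from the copy's path to 'ok', 'missing' or 'differs'.
    """
    expected = sha256_of(original)
    report = {}
    for copy in copies:
        if not copy.is_file():
            report[str(copy)] = "missing"
        elif sha256_of(copy) == expected:
            report[str(copy)] = "ok"
        else:
            report[str(copy)] = "differs"
    return report
```

Calling `check_copies` with the path of a dataset on the university drive and a list of paths on a cloud folder and a memory stick (all hypothetical locations) would give a quick report on which copies can still be trusted.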

This is only the tip of the iceberg, however, because the production of digital data raises almost as many questions as it provides innovative opportunities. There needs to be an early discussion in the supervisory team, for instance, about not simply which data will be stored, but where it will be stored, for how long, and who will have access to it. This is not simply an issue of security, although security, confidentiality, and appropriate use of the data will certainly figure in the discussion. There is a growing awareness that when public money is used to fund research, there needs to be a transparent return in the public interest. Initially this has meant that research results, reports, and journal articles should be made freely available to the public. This is being extended in the next Research Excellence Framework in the UK to insist that if the journal article is not already published as an open resource, it needs to be added as an open resource to the digital repository of the relevant institution. But there’s more.

The argument has been extended to include the research data generated by the public funding, so the datasets themselves are tending to become open and shared property. Whether the data is numbers, interviews, audio recordings, photographs, or other recordable results, the data being gathered by a researcher today is probably going to be a shared resource tomorrow. It will be possible for other researchers, in subsequent years, to access your raw data, perhaps combine it with other raw data, and re-analyse, re-interpret, and publish their conclusions. It now matters a great deal more exactly who can gain access to your research data, and for what purposes. As the law currently stands, a bona fide researcher can have access to open datasets for up to ten years after they have been deposited. But here is the catch: if a researcher accesses this data after nine years, the open-access clock is automatically re-set for a further ten years. This means that data which is being collected and digitally stored just now might still be openly available long after the initial researcher has moved on from that research topic, perhaps changed institutions, changed careers, maybe even passed away. The raw data of open access digital resources can now have a lifetime longer than the career-span of many individual researchers. So think carefully about what you gather, how you organise and store it, and what your legacy of research data will be!

What methods will help to answer the research question?


This is where it gets hard, not simply because the research student is venturing out into the unknown, but also because the methods through which the research will be conducted differ hugely between cultures, between disciplines, and between subjects within disciplines. There is no one-size-fits-all template which will allow a pick-and-choose approach to selecting the most appropriate methods. In one sense, though, this is an easy step, because it will probably be pretty obvious from the outset what methods will be needed in order to answer the research question(s). Almost all academic research methods will involve reading, either to follow up on what has already been said about the topic or to put it into a wider context. After that, the methods might include interviews, experiments, observations, questionnaires, focus groups, and a host of other activities which will change in emphasis from discipline to discipline. Getting the “correct” mixture of these methods is what will determine the methodology, that is, the system of methods for further research.

Here is where high technology can come in. I say “high” technology because even using a pen and paper or driving a car to conduct an interview is using technology, but of course we generally mean computer-based technology. In educational circles you will frequently hear the assertion that “the technology should never lead!”. This is certainly true, to an extent, but not entirely. For instance, if there are two (or more) ways to record research data, and one way entails using a high-technology solution which makes it easier, more flexible and/or more secure, then surely most sensible people would vote for the use of the technology. Examples might include the use of RefME to compile the dissertation reference list and store it on the cloud; using Mendeley to store the articles online; the use of SurveyMonkey to conduct a questionnaire online rather than face-to-face, giving time-flexibility, wider geographic coverage, and the ability to utilise automatic data analysis and presentation tools; the use of a free voice-recorder smartphone app to record interviews… The list could go on and on.

A crucial factor in all of this is to consider carefully, right at the start, how these methods will allow you to analyse and hopefully make sense of the data which will be gathered. It makes little sense to jump off a high point without knowing, even approximately, where you might land. Similarly, it makes little sense to gather mountains of data without any idea of how to begin to make sense of it. The supervisor should be able to give some clear directions, but ultimately each situation, each carefully worded question, is slightly different, and will have different constraints on time, resources, and abilities, so the student will need to be fully comfortable with the methodology before even starting the research. Prior studies in a similar area can help to provide some direction, but the precise mixture needs to be decided for each individual research project.

Deciding the general direction of the research


By their very definition, PhD studies are seeking to untangle complex ideas and produce original thoughts on the subject matter, which are backed up by a thorough examination of the evidence available. For this reason, what the research student is actually seeking is normally rather broad at first. When they start out and get asked the question, “So, what is your PhD about?” the typical student will give a rather hesitant, half-page explanation. Ask this question again when they are on the point of completing the PhD and the reply is likely to be a very concise and quite specific single sentence. The process of systematic research casts its net widely, then refines and re-focusses subsequent investigations to reinforce, or challenge, previous ideas and insights. Seeing the process as a little piece of a much larger, complex mosaic of ideas can be helpful, but a bit daunting.

To help the process of the distillation of knowledge, there are some basic techniques that any researcher can use. Firstly, it is wise to recognise that the PhD, as with almost any complex task, can be broken down into a number of smaller tasks, and that the role of the dissertation is to explain these tasks logically and clearly. In compiling the dissertation, the research student needs to present the story of the research effectively, from the introduction to the conclusions, in a way that makes it easy for the reader to understand what might be complicated and challenging issues. To make a start on this story-board, some people might like to use mind-maps to graphically link and make sense of the multitude of tasks that will need to be written about. Personally speaking, mind-maps do not really work for me. I prefer to construct a hierarchical list of all the possible sections and sub-sections. This has the advantage that such a list can very quickly be edited to provide the contents pages of the dissertation. For those who like diagrammatic checklists but struggle to find mind-maps useful, another way to help to identify the tasks that are required is to use software such as https://www.draw.io/ to create an easy-to-construct flow diagram which uses simple text and drag-and-drop shapes to (re)organise the sequence in which the research tasks need to “flow”.
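
To illustrate how quickly a hierarchical list can be turned into a draft contents page, here is a small sketch in Python. It assumes the outline is held as nested (title, sub-sections) pairs; the function name `number_outline` and the section titles are hypothetical, chosen only for the example.

```python
def number_outline(sections, prefix=""):
    """Flatten a nested outline into numbered contents lines.

    `sections` is a list of (title, children) pairs, where `children`
    is itself a list of the same shape (empty if there are none).
    """
    lines = []
    for index, (title, children) in enumerate(sections, start=1):
        number = f"{prefix}{index}"
        lines.append(f"{number} {title}")
        # Sub-sections inherit the parent's number as their prefix.
        lines.extend(number_outline(children, prefix=f"{number}."))
    return lines


# A hypothetical outline for a dissertation.
outline = [
    ("Introduction", []),
    ("Literature review", [("Background", []), ("Contested points", [])]),
    ("Methods", [("Interviews", []), ("Questionnaire", [])]),
    ("Conclusions", []),
]

contents = number_outline(outline)
print("\n".join(contents))
```

Because the numbering is generated from the structure, moving or adding a section in the list renumbers the whole contents page automatically, which suits an outline that is expected to change as the research develops.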

Whatever planning style is adopted, and regardless of whether the research student starts with a question, a hypothesis, or simply a broad subject title, the aim of the research planning at this stage is to lay out with a broad brush the likely trend of the enquiry. Obviously the actual course of the research is likely to change tack several times during the PhD as new ideas emerge and light is thrown into some currently dark corners, but the directional trend of the story, from the first sentence of the introduction to the last sentence of the conclusions, should remain relatively constant. To some extent, it helps at this stage to be as specific as possible in the identification of each possible section and sub-section of the future research, but obviously this itemisation needs to be treated lightly so that it is flexible enough to change and modify. Treat it like a story-line which can be embellished or contracted as the research student’s knowledge of the topic deepens and extends. Like all good stories, there should be a beginning, a middle, and an end, with a path to link them up.

Description versus critical review


In constructing a literature review of any proposed research topic, especially for new arrivals to research, there is often a tension between giving a straight description of the relevant academic articles and providing a critical analysis. This is understandable. The main purpose of the literature review is to provide subsequent readers with an introduction to the subject area of the research, and this is done by constructing a narrative, a story, of the evolution of the subject area to the stage that we understand at present. This narrative describes the “landscape” of the research subject area: the significant and salient points and the less well-known or contested points. The literature review, however, needs to be more than just a simple description of each significant article, more than a sort of “He said… then she said…” list of opinions.

The literature review, to be really useful, needs to critically evaluate the importance of each article, as well as providing a description of what was said, what methods were used, what degree of reliability the data has, etc. The reader has not only to understand the history of the development of the research topic, but to appreciate the relative merits of previous work. This is relatively easy at the start of the project, but by the end, juggling several hundred citations, it becomes a challenge.

A number of students and colleagues have drawn my attention to an app called RefME, a really interesting piece of software which enables a reference list to be compiled very quickly. Once a (free) account has been created on the app, entries of citations for books, journal articles, and lots of other artefacts can be added instantly by scanning the bar code of the publication using a phone with the app. The reference list can be built up and accessed from any device with a web connection. Reference lists can be divided into lists for particular projects (articles, conferences?) and each list can be exported to various formats, including a simple Word document. Each citation can also be annotated, so using a simple set of phrases and tags, a critical reference list can be compiled in minutes. The app also allows citations to be input manually, which is required for older publications and those without a bar code. There are several “easy” referencing systems available at present, but the simplicity, elegance, and flexibility of this app really impress me.

Whichever method is used to compile the reference list, there are two golden rules to adhere to. Firstly, start early to compile the reference list and keep on top of it. As an article or book is read, and if you know it is going to be referred to in the text of the dissertation, it should be immediately added to the reference list. Secondly, keep an annotated bibliography, not simply the list of all the references: copy the file and add short notes on each reference. Do not trust the memory to remember details such as page numbers (for direct quotations) and DOI numbers (for direct web access), or even the key points of analysis and critique. As the number of citations begins to mount, the details begin to blur and disappear. The annotated list will act as a memory jog, and also as a useful item to share with a supervisor to discuss the merits and demerits of individual articles. As time progresses, because they are focussed on one specific research topic, the PhD student will discover relevant articles which the supervisor(s) may not have seen, and anyway, there is life after the PhD so you might want some of this material again, years down the line. Don’t trust the memory!
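
For anyone who prefers to keep such a list in a form a computer can sort and search, the following is a rough sketch in Python (3.9 or later) of what one annotated entry might hold. The `Reference` class, its field names, and the helper functions are assumptions made for illustration, and the sample entry is a deliberate placeholder rather than a real publication.

```python
from dataclasses import dataclass, field


@dataclass
class Reference:
    """One entry in an annotated bibliography (illustrative fields only)."""
    authors: str
    year: int
    title: str
    doi: str = ""                                          # for direct web access
    quote_pages: list[str] = field(default_factory=list)  # pages of direct quotations
    notes: list[str] = field(default_factory=list)        # key points and critique
    tags: set[str] = field(default_factory=set)


def annotated_entry(ref: Reference) -> str:
    """Format a reference followed by its page notes and annotations."""
    lines = [f"{ref.authors} ({ref.year}). {ref.title}."]
    if ref.doi:
        lines.append(f"  DOI: {ref.doi}")
    if ref.quote_pages:
        lines.append("  Quoted pages: " + ", ".join(ref.quote_pages))
    for note in ref.notes:
        lines.append(f"  - {note}")
    return "\n".join(lines)


def with_tag(refs: list[Reference], tag: str) -> list[Reference]:
    """Return the references carrying a given tag, ordered by author and year."""
    matches = [ref for ref in refs if tag in ref.tags]
    return sorted(matches, key=lambda ref: (ref.authors, ref.year))


# Placeholder entry, not a real publication.
example = Reference(
    authors="Author, A.",
    year=2015,
    title="A placeholder article title",
    doi="10.0000/placeholder",
    quote_pages=["p. 12"],
    notes=["Small sample; method clearly described."],
    tags={"methods"},
)
print(annotated_entry(example))
```

The point of the design is simply that page numbers, DOIs, and critical notes are recorded at the moment of reading, alongside the citation itself, so that nothing depends on memory later.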

Keeping track of articles


One of the key skills in any research project is good organisation. This is especially true for a PhD research project, lasting as it does over three years of full-time study, or up to seven years part-time. Students start off with two or three seminal articles relating to their research topic, but the field of reference will grow dramatically within the first six months, and citations will continue to be added to the reference list right up until the dissertation is submitted. Even then, the external examiner(s) might insist at the viva that the student needs to consider further a certain area of the research, which will require further reading. Without a careful system, it does not take long for this growing pile of references to become unmanageable!

Some researchers swear by the old “traditional” system of individual index cards, alphabetically filed for each reference. This has the advantage of being able to add notes, summaries, questions etc., and also it is not dependent on technology, so does not require electricity or a battery. On the other hand, a file of cards is not very portable, can be a bit clumsy to sort, and not being digital, is less flexible to re-purpose. There are a number of software packages, both free and commercial, that allow you to store and sort references on a computer. A product called RefWorks provides an online database to manage bibliographic data, and this has numerous advantages, including being able to manipulate the data to display in different academic styles, create bibliographies for different publications, and also to access the data from different devices and locations. The university may subscribe to this product or some comparable service. Personally, I use a simple word-processed file. This does not have the flexibility of customised bibliographic management software, but it has the advantage of being easy to create and use without specialised training. To create a bibliography for a new article I simply cut-and-paste from my master list (not forgetting to keep back-up copies of the master list in other locations!).

In addition, Mendeley (https://www.mendeley.com/) is a free manager for references and PDF documents which can be used to annotate articles and share them online with students and other colleagues. It is easy to use (see https://youtu.be/qRiAIaqdAOg) and allows storage of, and access to, a personalised library collection from any internet location. So, for example, a researcher could import an identified article, store it in a personalised online space, add comments and questions to the file, then share it with an online social network which could include a research team, supervisors, or a cohort of students. Whatever filing system for research articles is used by a PhD student, it needs to be able to store, display and allow easy retrieval of anything that has been read over the duration of the study, which is not a simple task when this means five or six hundred individual references.

Online education

I appeared on Fred MacAulay’s programme on BBC Radio Scotland yesterday, talking about online education. The clip is at the end of this piece if you want to listen to it. (It only lasts ten minutes or so.) As is often the case with TV and radio media, the content tends to be fairly superficial and fast-changing in order to appeal to the widest range of listeners, but the advantage is that there are a LOT of listeners! In a way, it is interesting how the comments on “degrees by post” or “get a degree without getting out of your pajamas” have given way to a serious radio discussion among a whole range of other items which are simply taken as ‘part of life’. http://www.bbc.co.uk/programmes/b052my7v

More on getting started

The highly flexible nature of the internet means that websites appear and disappear every day. Thanks to Donald MacLean for reminding me that another currently popular university-managed site containing useful resources for prospective PhD students is http://cloudworks.ac.uk/

This is a freely available, open-access site, although you will need to register to obtain access. Once you have entered the site, search for “PhD” to find a link to “research skills required by PhD students”. The supporting text has short articles on a wide range of issues such as what is meant by ‘critical thinking’, how to select and justify your research methods, and tips on how to organise and present your work so that other people can appreciate it.
Like all of these sites, this one will not answer all of your questions, but it does contain different perspectives and useful information from people who have a lot of experience. When you are just starting out on your PhD research, not all of this advice will seem equally relevant. It makes good sense, however, to familiarise yourself with the variety of information on the site, and bookmark the URL, because you might want to return to these topics later in your studies as these issues take on a new relevance. This advice also applies to the supervisor, because you might wish to direct your student to read the advice which will reinforce (or give a different perspective to) guidance that you give to students in tutorial sessions.