Comtrade Analysis Step 1

R-jupyter-Script
Author: Christian Brandstätter
Contact: bran.chri@gmail.com
Copyright (C) 2019

Description: Script to download UN-Comtrade data. A short notice on folders: For my projects, I try to follow a rough structure as found in here: https://towardsdatascience.com/how-to-keep-your-research-projects-organized-part-1-folder-structure-10bd56034d3a So there are mainly four subfolders: 0_data, 1_code, 2_pipeline, 3_output. The first two and output are quite obvious, the 2_pipeline folder contains intermediate results for analysis. Depending on the size of the project, it could be split into subfolders (ideally per Script) as well. This is done here.

The following script shows how to download UN-Comtrade data for Austria. This is a first step to create Austrian trade balances over time.
As an example, two years are selected. In total data from 1963 are available. It should be easily possible for possible users to adapt the for loop in the final cell of this session to get data for the whole timeperiod.
For the demonstration, ALL registered product-tradeflows are selected.

setwd("../") # set something appropriate, in my case, code is stored in 1_code, so the WD is now the project root.  
library(rjson)

# This cell shows available Trade Partners in the comtrade database. 
string <- "http://comtrade.un.org/data/cache/partnerAreas.json"
reporters <- fromJSON(file=string)
reporters <- as.data.frame(t(sapply(reporters$results,rbind)))
loc_fol <- paste0(getwd(),"/comtrade")
head(reporters)

V1	V2
all	All
0	World
4	Afghanistan
472	Africa CAMEU region, nes
8	Albania
12	Algeria

# Comtrade provides an API for R to download data. 
# The following function definition stems from here: https://comtrade.un.org/data/Doc/api/ex/r
get.Comtrade <- function(url="http://comtrade.un.org/api/get?"
                         ,maxrec=50000
                         ,type="C"
                         ,freq="A"
                         ,px="HS"
                         ,ps="now"
                         ,r
                         ,p
                         ,rg="all"
                         ,cc="TOTAL"
                         ,fmt="json"
)

{
  string<- paste(url
                 ,"max=",maxrec,"&" #maximum no. of records returned
                 ,"type=",type,"&" #type of trade (c=commodities)
                 ,"freq=",freq,"&" #frequency
                 ,"px=",px,"&" #classification
                 ,"ps=",ps,"&" #time period
                 ,"r=",r,"&" #reporting area
                 ,"p=",p,"&" #partner country
                 ,"rg=",rg,"&" #trade flow
                 ,"cc=",cc,"&" #classification code
                 ,"fmt=",fmt        #Format
                 ,sep = ""
  )

  if(fmt == "csv") {
    raw.data<- read.csv(string,header=TRUE)
    return(list(validation=NULL, data=raw.data))
  } else {
    if(fmt == "json" ) {
      raw.data<- fromJSON(file=string)
      data<- raw.data$dataset
      validation<- unlist(raw.data$validation, recursive=TRUE)
      ndata<- NULL
      if(length(data)> 0) {
        var.names<- names(data[[1]])
        data<- as.data.frame(t( sapply(data,rbind)))
        ndata<- NULL
        for(i in 1:ncol(data)){
          data[sapply(data[,i],is.null),i]<- NA
          ndata<- cbind(ndata, unlist(data[,i]))
        }
        ndata<- as.data.frame(ndata)
        colnames(ndata)<- var.names
      }
      return(list(validation=validation,data =ndata))
    }
  }
}

# This following section serves as example: 
# Creating a list of names for saving
timeperiod_1 <- as.character(seq(1995,1996)) # HERE AN ADAPTATION WOULD BE NECESSARY TO ACCESS DIFFERENT YEARS 
datname_1 <- paste0(loc_fol,"/comtrade_AUT_WLD_",timeperiod_1,".csv") # List of file names to store all the data

# Loop for downloading:
# Austrian Trade Data are stored in different classification systems 
# until 1993 ST
# from 1993 HS 
# 2014 + 2015: H4
# 1995 - 1998: H0

for(i in 1:length(timeperiod_1)){
  print(timeperiod_1[i])

        # this is the comtrade function:
        # r = "40":  Austria
        # freq = "Annual": Yearly Sums
        # type = "c": Trade of commodities
        # p = "0": Partner is World; ANY import or export is considered
        # ps = timeperiod_1[i]: the respective year in the loop
        # px = "H0": Classification System H0 # THIS CAN BE ALTERED ACCORDING TO THE LIST ABOVE 
        # rg = "1,2": Trade Flow (Import + Export)
        # cc = "all": All the trade groups are considered
        # fmt = "json": default data format for download

        s2 <- get.Comtrade(r="40",freq="Annual",type="c", p="0",ps=timeperiod_1[i],
                           px="H0",rg="1,2",cc="all",fmt="json")

    if(!is.null((s2$data))){ # if no data is downloaded for a year, no file is saved 
  write.csv(s2$data,datname_1[i],row.names=FALSE)
  }
}

[1] "1995"
[1] "1996"

Now we have downloaded the data, and we can continue the process to create interesting insights of Austrian Trade Flows! For the continuing example the download script was adapted and extended. Just be aware, that there is a download limit to the Comtrade Database.