Data Retrieval from Xinjiang Astronomical Observatory's Pulsar Data Archive
Hailong Zhang1,2, Markus Demleitner3, Na Wang1,2, Jianping Yuan1,2, Jun Nie1,2, Jie Wang1
1. Xinjiang Astronomical Observatory, Chinese Academy of Sciences, Urumqi 830011, China;
2. Key Laboratory of Radio Astronomy, Chinese Academy of Sciences, Nanjing 210008, China;
3. Heidelberg University, Zentrum für Astronomie, Mönchhofstr. 12-14, 69120 Heidelberg, Germany
Abstract: The Xinjiang Astronomical Observatory (XAO) Pulsar Data Archive currently provides access to 32,290 data files obtained from observations carried out with the 25 m radio telescope at Nanshan station since 2000. Both the data files and the access methods comply with Virtual Observatory (VO) standards and protocols. This paper is a tutorial on using the XAO Pulsar Data Archive, both through the on-line interface and through VO tools; it also describes the data currently stored in the archive and presents the ways in which data can be searched and downloaded.
Key words: Pulsar data     Data center     Virtual observatory     Data query    
1 Introduction

The data archive portal of Xinjiang Astronomical Observatory (XAO, http://www.xao.ac.cn) at http://data.xao.ac.cn is the primary repository for XAO data products and the main interface to the science user community. The XAO Pulsar Data Archive provides authorized access to 32,290 files from pulsar observations recorded at the XAO Nanshan station. The on-line interface is http://data.xao.ac.cn/pul/pulsar/q/form. Not all data are in the public domain yet; access permissions are divided into two levels. The first level is for data preview: logging in with username “pulsar” and password “astronomy” lets one browse the data, i.e. view the corresponding pulse profile and other detailed information, but not download the original files. The second level allows one to download the raw data files. These data may only be used with the permission of Dr. Na Wang (na.wang@xao.ac.cn); please send her an email with your request.

The pulsar[1] timing data were obtained with the Nanshan 25 m radio telescope. Observations commenced in January 2000. Until June 2002 they were made with a dual-channel room-temperature receiver with a bandwidth of 320 MHz centered at 1540 MHz. De-dispersion was provided by a 2 × 128 × 2.5 MHz analog filter-bank (AFB); the AFB data are in the “Timer” format. A cryogenic receiver was mounted in July 2002, which increased the sensitivity to 0.5 mJy. In January 2010, a digital filter-bank (DFB)[2] system came into operation. Its higher time resolution allows us to monitor about 280 pulsars, including ten millisecond pulsars (MSPs). The DFB data are in the “Psrfit” format and can be read and analyzed with the “psrchive”[3] program.

2 GAVO

The XAO pulsar data release environment is implemented on the GAVO[4] data release framework. The German Astrophysical Virtual Observatory (GAVO, http://www.g-vo.org) is the German contribution to the International Virtual Observatory Alliance (IVOA, http://www.ivoa.net/), an international effort to create and expand the Virtual Observatory (VO). GAVO's services[5] are open to all astronomers as well as the general public.

The goals of the Virtual Observatory are to allow or improve access to astronomical data of all kinds (astrometry, photometry, spectroscopy, time series, ...) from everywhere through well-defined protocols; to let astronomers easily discover, access and use data relevant to their research; and to ensure that data do not simply disappear, but are properly described and can be accessed and understood in the future. It also aims to provide software that helps astronomers use all of this.

3 Archive data sets and format

All pulsar data stored in the archive follow the PSRFITS[6] standard. Each file contains a single observation of a pulsar or of a particular area of sky, and its file name indicates the date and time of the observation. Folded pulsar archives have the file extension ‘.rf’, calibration source files have ‘.cf’, and observations obtained in search mode have ‘.sf’[7-8].
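The extension convention described above can be used to sort downloaded files by observation type. The following is a minimal sketch; the function name and the example file names are hypothetical, and only the three extensions come from the text.

```python
from pathlib import Path

# Extension-to-content mapping as described in the archive documentation.
EXTENSION_KINDS = {
    ".rf": "folded pulsar archive",
    ".cf": "calibration source observation",
    ".sf": "search-mode observation",
}


def classify_archive_file(filename: str) -> str:
    """Return the kind of observation implied by the file extension."""
    suffix = Path(filename).suffix.lower()
    return EXTENSION_KINDS.get(suffix, "unknown")


# Hypothetical file names: real archive names encode the observation date
# and time, but their exact pattern is not assumed here.
for name in ["obs_a.rf", "obs_b.cf", "obs_c.sf", "notes.txt"]:
    print(name, "->", classify_archive_file(name))
```

Anything that does not carry one of the three extensions is reported as "unknown" rather than guessed at.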

The digital filter-bank (DFB) system came into operation in January 2010, and the main data collection for the archive has been ongoing since then. Data have been recorded using an auto-correlation spectrometer, incoherent de-dispersion systems and the digital filter-bank (PDFB3). The PDFB3 system directly produces PSRFITS data, and we make no changes to the data files for inclusion in the archive. Individual data files range in size from 16 MB to 1 GB, and the total size of the data archive is 4 TB or more.

4 Obtaining the data

The pulsar query page can be reached in either of two ways:

(1) Point the web browser directly at http://data.xao.ac.cn/pul/pulsar/q/form.

(2) From the XAO Data Access Portal home page http://data.xao.ac.cn (Fig. 1), click “XAO Pulsar Data Query” to open the data release page.

Figure 1 The main page of XAO Data Archive Portal
4.1 Interface features

The index page (Fig. 1) lists published services available through web browsers. A “[P]” in the service listing means that the service is password protected, either because it is too rough for public consumption or because the data providers want exclusive access. In either case, you can contact the site operators to inquire about access.

Fig. 2 shows the basic information about the XAO Pulsar Data Query service, which is displayed after you click the black triangle in front of the “[P]”.

Figure 2 Brief information of XAO Pulsar Data Query service

More information can be seen by clicking the “[i]” link. Table 1 lists the fields of the pulsar table and their descriptions.

Table 1 Fields information of Pulsar Table
Name | Table Head | Description
accref | Product key | Access key for the data
backend | Backend ID | Backend ID
beconfig | Backend Configuration File Name | Backend Configuration File Name
date_obs | Date_obs | Observation date
frontend | Receiver ID | Receiver ID
object | Target Object | Object observed
object_one | Target Object | Object observed
obs_mode | Observation Mode | Observation Mode
obsbw | Observation Bandwidth | Observation Bandwidth
observer | Observer | Observer Name
obsfreq | Observation Frequency | Observation Frequency
obsnchan | Number of Frequency Channels | Number of Frequency Channels
pos | Position/Name | Coordinates (as h m s, d m s or decimal degrees), or SIMBAD-resolvable object
projid | Project Name | Project Name
sr | Sr | ROI radius in degrees
4.2 Query Fields

Numeric expressions: you can recognize these from the little "[?num.expr.]" tag behind them. In addition to raw numbers, you can enter VizieR-like numeric expressions here.

String expressions: these have a little "[?char expr.]" tag and by default match using patterns, evaluating metacharacters like * or ? much as in file name patterns. To force literal matching, prepend == to your string. Other operators available for string expressions include caseless matching, string comparisons, and negation.

Date expressions: these are marked with "[?date expr.]" tags. Dates must be given in the ISO format (YYYY-MM-DD). Among the most useful of the supported operators is the range: you can enter “2004-01-02 .. 2005-05-01” to specify the dates between Jan 2nd, 2004 and May 1st, 2005.

Selection boxes: these are either drop-down lists, in which case you can select only one entry, or open boxes in which you can select more than one entry, usually (depending on the user interface) by control-clicking.

Others: application-specific input fields (e.g., cone searches) should come with a short explanation.
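To make the semantics of string and date expressions concrete, the sketch below mimics them locally in Python. It is an illustration under stated assumptions, not the server's implementation: the function names are hypothetical, Python's `fnmatchcase` stands in for the service's pattern matching (it also accepts `[...]` sets, which the service may not), and the date range is assumed to include both end points.

```python
from datetime import date
from fnmatch import fnmatchcase


def match_string(expr: str, value: str) -> bool:
    """Mimic the default pattern match and the '==' literal match."""
    if expr.startswith("=="):
        # Literal comparison: metacharacters lose their special meaning.
        return value == expr[2:].strip()
    # Default: * and ? behave as in file name patterns.
    return fnmatchcase(value, expr)


def parse_date_range(expr: str) -> tuple[date, date]:
    """Split 'YYYY-MM-DD .. YYYY-MM-DD' into two date objects."""
    start, end = (part.strip() for part in expr.split(".."))
    return date.fromisoformat(start), date.fromisoformat(end)


def in_date_range(expr: str, day: date) -> bool:
    """Check whether day lies in the range (end points assumed inclusive)."""
    start, end = parse_date_range(expr)
    return start <= day <= end


print(match_string("J0835*", "J0835-4510"))      # pattern match
print(match_string("==J0835*", "J0835-4510"))    # literal match
print(in_date_range("2004-01-02 .. 2005-05-01", date(2004, 6, 1)))
```

The pulsar name used here is only an example value for the Target Object field.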

4.3 Query Modifiers

For most queries, you will have a "Table" query field near the bottom of the form, where you can set sorting and limit options. Note that, depending on the query, selecting a column to sort by may slow down the answer dramatically. This is because your query may match large amounts of data, and even if only 100 items are returned, potentially millions of them may have had to be sorted. On the other hand, results overflowing the match limit are not reproducible without a sort option, i.e., you may get a different set of, say, 100 items for identical query parameters at two different times. The services warn you about this when they return truncated results.

If the match limits offered by the form do not suit you, you can override the match limit by editing the result link (see below): substitute your value into _DBOPTIONS_LIMIT=100. The system has a hard-coded match limit that you cannot override in this way, but it is unlikely to get in your way.
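Editing the result link by hand is error-prone, so the substitution can also be scripted. The sketch below rewrites the `_DBOPTIONS_LIMIT` parameter of a result URL with the Python standard library. The helper name and the example link are hypothetical: only `_DBOPTIONS_LIMIT` is taken from the text, and the `object` parameter is a placeholder for whatever the form actually produced.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_match_limit(result_url: str, limit: int) -> str:
    """Return result_url with _DBOPTIONS_LIMIT set to the given value."""
    parts = urlsplit(result_url)
    # keep_blank_values preserves form fields that were left empty.
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "_DBOPTIONS_LIMIT"
    ]
    params.append(("_DBOPTIONS_LIMIT", str(limit)))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical result link, for illustration only.
example = (
    "http://data.xao.ac.cn/pul/pulsar/q/form"
    "?object=J0835-4510&_DBOPTIONS_LIMIT=100"
)
print(with_match_limit(example, 1000))
```

The server's hard-coded upper limit still applies, whatever value is substituted.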

You can usually select an output format. Options here include:

HTML (http://www.w3.org/MarkUp/): data is returned in your web browser. You can select additional columns in your output from an input field that pops down when you mouse over it.

VOTable (http://www.ivoa.net/documents/latest/VOT.html): data is returned in IVOA's standard data format, the VOTable. This is XML that can be human-readable but is really intended to be consumed by tools like TOPCAT. You can select a "verbosity" specifying the fields present in the output, with 1 standing for a minimal set of information, 2 for what the service author deemed useful for the average astronomer, 3 for (almost) all fields available, and finally H for (essentially) what the default HTML table gives. Furthermore, you can choose between a "human-readable" VOTable (select this to process your data with standard XML or text-based tools or to peruse it with the naked eye) and a binary version that you should use for larger data sets since it is much more efficient.

FITS (http://fits.gsfc.nasa.gov/): this returns FITS tables. The data is in the first extension. This contains much less meta information than a VOTable of the same data and thus should only be used if your backend tools do not understand VOTables.

TSV: tab-separated values. In a desperate pinch, you can get the table contents as an ASCII file with the fields separated by tabs. All metadata is lost. Null values are (almost) always rendered as the string "None". Strings containing non-ASCII or control characters are rendered with C escapes (\n, \t, etc.) or hexadecimal Unicode code points (\xe4, e.g., is an ä). Don't use this unless you absolutely have to.

JSON (http://www.json.org/js.html): JavaScript Object Notation. It is based on a subset of the JavaScript Programming Language, Standard ECMA-262 3rd Edition, December 1999. JSON is a text format that is completely language independent but uses conventions that are familiar to programmers of the C family of languages, including C, C++, C#, Java, JavaScript, Perl, Python, and many others. These properties make JSON an ideal data-interchange language.

CSV: comma-separated values. This format carries almost no metadata either, but it is understood by many database programs, spreadsheets, etc. (Note: don't use Excel to process astronomical data. TOPCAT is so much nicer, and it has built-in VO support.) Null values are mostly rendered as empty fields, but float NULLs are NaNs.

Tar: you can download all matching items, usually FITS files of some kind, in a tar file.
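The null-value conventions of the two plain-text formats are easy to trip over, so here is a small sketch of reading them with the Python standard library. It assumes that the TSV output has no header row (so the column names are passed in by the caller) and that the CSV output starts with a header row. Both are assumptions about the service, and the function names and sample rows are invented for illustration.

```python
import csv
import io
import math


def read_tsv(text: str, header: list[str]) -> list[dict]:
    """Parse TSV rows; the string "None" is turned into Python None."""
    rows = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        {name: (None if cell == "None" else cell) for name, cell in zip(header, row)}
        for row in rows
    ]


def read_csv(text: str, float_columns: set[str]) -> list[dict]:
    """Parse CSV with a header row; empty fields become None."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = {}
        for name, cell in row.items():
            if cell == "":
                record[name] = None
            elif name in float_columns:
                record[name] = float(cell)  # "NaN" parses to a float NaN
            else:
                record[name] = cell
        records.append(record)
    return records


# Invented sample rows using field names from Table 1.
tsv_rows = read_tsv("J0835-4510\t1540.0\tNone\n", ["object", "obsfreq", "observer"])
csv_rows = read_csv("object,obsfreq,observer\nJ0835-4510,NaN,\n", {"obsfreq"})
print(tsv_rows)
print(csv_rows, math.isnan(csv_rows[0]["obsfreq"]))
```

For anything beyond a quick look, the VOTable output remains the better choice because it keeps the column metadata.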

4.4 Examples

This section gives three step-by-step examples of querying the data. The first is a Cone Search, the second shows how to combine multiple constraints in one query, and the third sends the query results to Virtual Observatory tools for data visualization.

(1) Cone Search

Cone Search (http://www.ivoa.net/Documents/PR/DAL/ConeSearch-20070628.html) is an IVOA protocol that defines a simple query for retrieving records from a catalog of astronomical sources. The query specifies a sky position and an angular distance, defining a cone on the sky. The response is a list of astronomical sources whose positions lie within the cone. A Cone Search takes four steps:

Step 1. Open the XAO Pulsar Data Query website in a web browser.

Step 2. Enter the search criteria in the search form (shown in Fig. 3). A position and a search radius are needed; the position is given either as RA and DEC coordinates (as h m s, d m s or decimal degrees) or as a SIMBAD-resolvable object name.

Figure 3 Schematic diagram of Cone Search query

Step 3. Select an output format and other parameters, then press the “GO” button.

Step 4. Check the results.
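The geometric test behind a Cone Search can be sketched in a few lines: a source is returned when its great-circle distance from the search position does not exceed the search radius. The code below is an illustration with hypothetical function names, using the haversine formula; it is not the service's own implementation. All angles are in decimal degrees.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle distance in degrees between two sky positions."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations.
    h = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def in_cone(ra, dec, sr, src_ra, src_dec):
    """True if the source lies within sr degrees of the position (ra, dec)."""
    return angular_separation_deg(ra, dec, src_ra, src_dec) <= sr


# Search centre at RA = 10 deg, DEC = 20 deg with a radius of 0.5 deg.
print(in_cone(10.0, 20.0, 0.5, 10.0, 20.4))  # inside the cone
print(in_cone(10.0, 20.0, 0.5, 10.0, 20.6))  # outside the cone
```

Computing the separation on the sphere, rather than comparing coordinate differences directly, keeps the test correct near the celestial poles, where a small angular distance can span a large range in right ascension.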

(2) Multiple constraints query

A query may contain several retrieval conditions, one per form field. When more than one condition is given, all of them are enforced together: a record is returned only if it satisfies every condition (a logical AND). As shown in Fig. 4, this example uses four retrieval conditions: the first is the target name, the second the observation date, the third the observation frequency and the last the bandwidth.

Figure 4 Schematic diagram of multiple constraints query
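The effect of combining several conditions can be illustrated on a local list of records. This sketch assumes that all conditions must hold at the same time, and that frequency and bandwidth are expressed in MHz. The rows are invented for illustration, the `matches_all` helper is hypothetical, and only the field names come from Table 1.

```python
from datetime import date

# Invented records; field names follow Table 1, values are illustrative.
records = [
    {"object": "J0835-4510", "date_obs": date(2013, 3, 1), "obsfreq": 1540.0, "obsbw": 320.0},
    {"object": "J0835-4510", "date_obs": date(2011, 7, 9), "obsfreq": 1540.0, "obsbw": 320.0},
    {"object": "J0534+2200", "date_obs": date(2013, 3, 2), "obsfreq": 1540.0, "obsbw": 320.0},
]

# One test per constrained field: target, date, frequency, bandwidth.
constraints = {
    "object": lambda v: v == "J0835-4510",
    "date_obs": lambda v: date(2013, 1, 1) <= v <= date(2013, 12, 31),
    "obsfreq": lambda v: v == 1540.0,
    "obsbw": lambda v: v == 320.0,
}


def matches_all(record, constraints):
    """True only if the record passes every constraint."""
    return all(test(record[field]) for field, test in constraints.items())


selected = [r for r in records if matches_all(r, constraints)]
print(selected)
```

Each additional condition can only shrink the result set, which is why adding a date or frequency constraint is an effective way to stay below the match limit.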

(3) Data Visualization

As shown in Fig. 5, using “~*2013*” as the search condition in the Observation Date field returns 4694 observation records from 2013. The output format is HTML, and pulse profile previews appear when you mouse over the Product key field. Clicking the Preview link shows the large pulse profile image, which is generated by the “pav -DFTp” command from the PSRCHIVE software package (http://psrchive.sourceforge.net/). Fig. 6 shows the large pulse profile.

Figure 5 Schematic diagram of pulse profile
Figure 6 Large pulse profile preview by PSRCHIVE

Fig. 7 shows the results analyzed with TOPCAT (http://topcat.switchinc.org/), a Virtual Observatory tool from the AstroGrid software package (http://www.astrogrid.org).

Click the “send via SAMP” button (upper left corner of Fig. 5) to send the query results to TOPCAT. SAMP is the Simple Application Messaging Protocol (see http://astropy.readthedocs.org/en/latest/vo/samp/). TOPCAT must already be running before you use the “send via SAMP” function. TOPCAT's plotting functionality becomes available once the query results have been received. Fig. 7 shows the pulsar positions on the 3D galactic coordinate sphere using the Spherical Plot.

Figure 7 Results visualization by TOPCAT
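The same hand-over can be scripted. The sketch below builds a SAMP `table.load.votable` message and, optionally, broadcasts it with the `SAMPIntegratedClient` from `astropy.samp`. The VOTable URL is a hypothetical placeholder, the helper names are invented, and the send step needs both astropy and a running SAMP hub (TOPCAT starts one by default), so it is left commented out.

```python
def build_table_load_message(votable_url: str, name: str) -> dict:
    """Build a SAMP 'table.load.votable' message for a VOTable URL."""
    return {
        "samp.mtype": "table.load.votable",
        "samp.params": {"url": votable_url, "name": name},
    }


def send_to_samp_clients(message: dict) -> None:
    """Broadcast the message through a running SAMP hub."""
    # astropy is imported lazily so the message builder works without it.
    from astropy.samp import SAMPIntegratedClient

    client = SAMPIntegratedClient()
    client.connect()
    try:
        client.notify_all(message)
    finally:
        client.disconnect()


message = build_table_load_message(
    "http://example.org/result.xml",  # hypothetical VOTable URL
    "XAO pulsar query",
)
print(message)
# send_to_samp_clients(message)  # uncomment with TOPCAT running
```

Receiving clients fetch the table from the given URL themselves, so the URL must be reachable from the machine on which TOPCAT runs.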
5 Conclusions

The XAO Pulsar Data Archive currently provides access to 32,290 data files. We have implemented cone search and multiple-constraint queries, and provide HTML, VOTable, CSV, JSON and tar package output formats. Pulse profile previews are available in the HTML output. The results of this data management research will be applied in the future to a large-diameter radio telescope in Xinjiang.

Acknowledgement:

The Taurus High Performance Computing Cluster of Xinjiang Astronomical Observatory, CAS, was used during the testing of the work presented in this paper.

References
[1] Wang Na. Pulsar astronomy in China[J]. Chinese Journal of Astronomy and Astrophysics, 2006, 6(2): 1-3.
[2] Hampson G, Brown A. A 1GHz Pulsar Digital Filter Bank and RFI Mitigation system[M/OL]. 2008[2016-01-22]. http://www.jb.man.ac.uk/pulsar/observing/DFB.pdf.
[3] Keith M J. Installation and use of pulsar search software[J]. Astronomical Research & Technology, Publications of National Astronomical Observatories of China, 2012, 9(3): 219-228.
[4] Demleitner M, Neves M C, Rothmaier F, et al. Virtual observatory publishing with DaCHS[J]. Astronomy and Computing, 2014, 7.
[5] Demleitner M, Gufler B, Kim J, et al. The German Astrophysical Virtual Observatory (GAVO): archives and applications, status and services[J]. Astronomische Nachrichten, 2007, 328(7): 713.
[6] Hotan A W, van Straten W, Manchester R N. PSRCHIVE and PSRFITS: an open approach to radio pulsar data storage and analysis[J]. Publications of the Astronomical Society of Australia, 2004, 21(3): 302-309. DOI: 10.1071/AS04022
[7] Hobbs G, Miller D, Manchester R N, et al. The Parkes observatory pulsar data archive[J]. Publications of the Astronomical Society of Australia, 2011, 28(3): 202-214. DOI: 10.1071/AS11016
[8] Khoo J, Hobbs G, Manchester R N, et al. Using the Parkes pulsar data archive[J]. Astronomical Research & Technology, Publications of National Astronomical Observatories of China, 2012, 9(3): 229-238.
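Pulsar Data Retrieval at Xinjiang Astronomical Observatory (Chinese-language abstract)
Abstract: Xinjiang Astronomical Observatory has so far archived 32,290 pulsar observation data files. The pulsar data retrieval platform provides search services for observation data of nearly 300 pulsars obtained with the 25 m radio telescope at Nanshan station since 2000. The data files and the retrieval and access methods comply with Virtual Observatory standards and protocols. The paper describes how to obtain data through the XAO pulsar data retrieval platform, how to carry out simple data processing with Virtual Observatory tools, and how to use cone search and multiple-constraint target search.
Key words: Pulsar data     Data center     Virtual observatory     Data retrieval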
Sponsored by the National Astronomical Observatories, Chinese Academy of Sciences.

Article information

Hailong Zhang, Markus Demleitner, Na Wang, Jianping Yuan, Jun Nie, Jie Wang
Data Retrieval from Xinjiang Astronomical Observatory's Pulsar Data Archive
ASTRONOMICAL RESEARCH AND TECHNOLOGY, 2016, 13(4): 473-480.
Received: 2016-02-22