atelParser

Scrape and parse the content of ATels posted on The Astronomer’s Telegram website, and identify individual objects by name and coordinates.

Documentation for atelParser is hosted by Read the Docs (both the last stable version and the latest version). The code lives on GitHub. Please report any issues you find via GitHub issues.

Features

  • Report the latest ATel count

  • Download all ATels as raw HTML pages. After the first download, the tool can be rerun on a regular basis to download only new or missing ATels.

  • Parse ATels to extract coordinates and transient source names to indexed MySQL database tables which can then be used in your own projects.

How to cite atelParser

If you use atelParser in your work, please cite using the following BibTeX entry:

@software{
    Young_atelParser,
    author = {Young, David R.},
    doi = {10.5281/zenodo.8037458},
    license = {GPL-3.0-only},
    title = {atelParser},
    url = {https://github.com/thespacedoctor/atelParser}
}

Installation

The easiest way to install atelParser is to use pip (here we show the install inside of a conda environment):

conda create -n atelParser python=3.7 pip
conda activate atelParser
pip install atelParser

Or you can clone the GitHub repo and install from a local version of the code:

git clone git@github.com:thespacedoctor/atelParser.git
cd atelParser
python setup.py install

To upgrade to the latest version of atelParser use the command:

pip install atelParser --upgrade

To check that the installation was successful, run atelParser -v. This should return the version number of the installed package.

Development

If you want to tinker with the code, then install in development mode. This means you can modify the code from your cloned repo:

git clone git@github.com:thespacedoctor/atelParser.git
cd atelParser
python setup.py develop

Pull requests are welcomed!

Initialisation

Before using atelParser you need to run the init command to generate a user settings file. Running the following creates a YAML settings file in your home folder at ~/.config/atelParser/atelParser.yaml:

atelParser init

The file is initially populated with atelParser’s default settings which can be adjusted to your preference.

If at any point the user settings file becomes corrupted or you just want to start afresh, simply trash the atelParser.yaml file and rerun atelParser init.

Modifying the Settings

Once created, open the settings file in any text editor and make any modifications needed. The most important setting is the atel-directory as this lets atelParser know where to download the ATel HTML files to. Change this value to your preferred location.

atel-directory: ~/git_repos/atel-archive/html
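Because the downloader writes one HTML file per ATel into this folder, it can be worth confirming that the path resolves and exists before starting a long download run. The following is a minimal sketch, assuming the settings have already been read into a plain dictionary; the helper name prepare_atel_directory is hypothetical and is not part of atelParser.

```python
from pathlib import Path


def prepare_atel_directory(settings):
    """Expand '~' in the atel-directory setting and create the folder if needed.

    `settings` is assumed to be a plain dict holding the parsed YAML settings.
    This helper is illustrative only and is not part of atelParser.
    """
    directory = Path(settings["atel-directory"]).expanduser()
    # parents=True builds any missing intermediate folders;
    # exist_ok=True makes the call safe to repeat.
    directory.mkdir(parents=True, exist_ok=True)
    return directory
```

Calling prepare_atel_directory(settings) with the settings dictionary returned by the setup code in Basic Python Setup would return the resolved folder as a Path object.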

Basic Python Setup

If you plan to use atelParser in your own scripts you will first need to parse your settings file and set up logging etc. One quick way to do this is to use the fundamentals package to give you a logger, a settings dictionary and a database connection (if connection details are given in the settings file):

## SOME BASIC SETUP FOR LOGGING, SETTINGS ETC
from fundamentals import tools
from os.path import expanduser
home = expanduser("~")
settingsFile  = home + "/.config/atelParser/atelParser.yaml"
su = tools(
    arguments={"settingsFile": settingsFile},
    docString=__doc__,
)
arguments, settings, log, dbConn = su.setup()

Latest ATel Count

The simplest tool in the atelParser toolbox is the latest ATel count, which reports the number of the most recently posted ATel.

From the Command-Line

To get the count from the command-line, run:

> atel count
14318 ATels have been reported as of 2021/01/13 10:48:11s

From Python Code

To get the count from Python use the get_latest_atel_number method:

from atelParser import download
atels = download(
    log=log,
    settings=settings
)
latestNumber = atels.get_latest_atel_number()

Downloading ATels

From the Command-Line

To download new/missing ATels run atel download from the command-line:

> atel download
Waiting for a randomly selected 35s before downloading ATel #14317
Waiting for a randomly selected 101s before downloading ATel #14318
...

Note that a random wait of between 0 and 180s is inserted before each ATel page download so as not to overwhelm the ATel servers.
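The throttling behaviour described above can be sketched in a few lines. This is not atelParser's actual implementation; polite_fetch and its fetch argument are hypothetical names, and the sleep function is injectable only so the loop can be exercised without really waiting.

```python
import random
import time


def polite_fetch(atel_numbers, fetch, max_wait=180, sleep=time.sleep):
    """Call fetch(number) for each ATel number, pausing randomly before each call.

    `fetch` is a hypothetical caller-supplied function that retrieves one page.
    `max_wait` mirrors the 0-180s range quoted in the text.
    """
    results = []
    for number in atel_numbers:
        # randint is inclusive at both ends, so waits fall in [0, max_wait].
        wait = random.randint(0, max_wait)
        print(f"Waiting for a randomly selected {wait}s before downloading ATel #{number}")
        sleep(wait)
        results.append(fetch(number))
    return results
```

Spreading requests out at random intervals keeps the load on the remote server low and irregular, at the cost of a slow first download of the full archive.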

From Python Code

Before you begin to code you will need the log and settings objects created by the setup code shown in the Basic Python Setup section above.

Assuming you have set up your atel-directory location in the settings file (see Modifying the Settings), you can download all new/missing ATel pages with the following code snippet.

## DOWNLOAD ALL NEW ATEL PAGES
from atelParser import download
atels = download(
    log=log,
    settings=settings
)
atelsToDownload = atels.get_list_of_atels_still_to_download()
atels.download_list_of_atels(atelsToDownload)

Once run, you should find one HTML file per ATel in your atel-directory folder. You can find more information on the download class in the API Reference.
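To illustrate the idea of "new/missing" ATels, the sketch below compares the files on disk with the latest ATel number. It assumes each file is named after its ATel number with an .html extension (optionally zero-padded); the real naming scheme used by atelParser may differ, and missing_atel_numbers is a hypothetical helper rather than part of the package.

```python
from pathlib import Path


def missing_atel_numbers(directory, latest_number):
    """Return the ATel numbers in 1..latest_number with no HTML file on disk.

    Assumes files are named '<number>.html', optionally zero-padded.
    This is an illustration, not atelParser's own logic.
    """
    present = set()
    for path in Path(directory).glob("*.html"):
        # Ignore any HTML file whose name is not purely numeric.
        if path.stem.isdigit():
            present.add(int(path.stem))
    return [n for n in range(1, latest_number + 1) if n not in present]
```

In practice the get_list_of_atels_still_to_download method shown above does this work for you.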

Parsing ATels To A Database

After downloading the ATel HTML files you now have the option of adding the content of the ATels to a MySQL database and to parse this content to generate indexed tables of coordinates and transient source names.

Connection details are needed in the atelParser settings file for the parser to access the database.

The parser will create and populate the following three tables:

  • atel_fullcontent: containing a list of ATels and their full-text content.

  • atel_names: a list of transient source names found via regex matching of the ATel text content. Transients from new surveys and mangled names may get missed (please report via GitHub issues if you find a problem).

  • atel_coordinates: sky-position coordinates as parsed from the ATel content and converted to decimal degrees (also indexed via 3 different HTM level IDs). Some coordinates may have been missed if written in an obscure syntax (or just incorrectly).

The indexed transient source data in these tables can then be used in your own projects.
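The two parsing steps described above, regex matching of names and conversion of coordinates to decimal degrees, can be sketched as follows. Both functions are hypothetical illustrations: the pattern only covers simple "SN"/"AT" designations, the coordinate parser only handles colon-separated sexagesimal strings, and atelParser's own rules are considerably broader.

```python
import re

# Illustrative pattern only: matches designations such as "SN 2020abc" or
# "AT2021xy". It is not the set of regexes atelParser itself uses.
NAME_PATTERN = re.compile(r"\b(SN|AT)\s?(\d{4}[A-Za-z]{1,4})\b")


def find_transient_names(text):
    """Return the sorted, de-duplicated names found in text, without spaces."""
    matches = NAME_PATTERN.findall(text)
    return sorted({prefix + designation for prefix, designation in matches})


def sexagesimal_to_decimal(ra, dec):
    """Convert 'HH:MM:SS.s' and '+DD:MM:SS.s' strings to decimal degrees."""
    hours, minutes, seconds = (float(part) for part in ra.split(":"))
    # One hour of right ascension corresponds to 15 degrees.
    ra_deg = (hours + minutes / 60 + seconds / 3600) * 15
    dec = dec.strip()
    # Take the sign from the string so that '-00:30:00' stays negative.
    sign = -1.0 if dec.startswith("-") else 1.0
    degrees, arcmin, arcsec = (float(part) for part in dec.lstrip("+-").split(":"))
    dec_deg = sign * (degrees + arcmin / 60 + arcsec / 3600)
    return ra_deg, dec_deg
```

Reading the sign from the string rather than from the degrees value matters for declinations between -1 and 0 degrees, where the degrees field is zero and carries no sign of its own.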

From the Command-Line

To parse the downloaded ATels from the command-line run:

> atel parse

From Python Code

If scripting the parsing of the ATels in your own code, use the mysql class to parse the ATels and ingest them into the MySQL database tables:

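## PARSE THE DOWNLOADED ATELS INTO THE MYSQL DATABASE TABLES
from atelParser import mysql
# reparseFlag IS ASSUMED TO BE A BOOLEAN; False PARSES ONLY ATELS NOT YET PARSED
reparseFlag = False
parser = mysql(
    log=log,
    settings=settings,
    reParse=reparseFlag
)
# ADD THE ATEL HTML CONTENT TO THE DATABASE
parser.atels_to_database()
# EXTRACT NAMES AND COORDINATES FROM THE ATEL CONTENT
parser.parse_atels()
# ADD THE HTM INDEX IDS TO THE COORDINATES
parser.populate_htm_columns()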
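After a parse, a quick sanity check is to count the rows in each of the three tables. The sketch below assumes dbConn (from the setup code) behaves as a standard Python DB-API 2.0 connection, as pymysql connections do; count_rows is a hypothetical helper, and only the table names come from the list above.

```python
# Table names as listed in this documentation.
ATEL_TABLES = ("atel_fullcontent", "atel_names", "atel_coordinates")


def count_rows(db_conn, table):
    """Return the number of rows in one of the atelParser tables.

    `db_conn` is assumed to be a DB-API 2.0 connection. The table name is
    checked against a fixed list because it is interpolated into the SQL.
    """
    if table not in ATEL_TABLES:
        raise ValueError(f"unexpected table name: {table}")
    cursor = db_conn.cursor()
    try:
        cursor.execute(f"SELECT COUNT(*) FROM {table}")
        row = cursor.fetchone()
    finally:
        cursor.close()
    # Tuple-style cursors return (n,); dict-style cursors return {"COUNT(*)": n}.
    return list(row.values())[0] if isinstance(row, dict) else row[0]
```

For example, count_rows(dbConn, "atel_names") would report how many transient names have been extracted so far.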

Release Notes

v1.0.2 - May 10, 2022

  • Fixed: docs now building

v1.0.1 - January 14, 2021

  • Fixed: dependency clash with other packages for pymysql version

v1.0.0 - January 13, 2021

  • Enhancement: full documentation

v0.4.0 - May 4, 2020

  • Now compatible with Python 3.*

  • Fixed: adding requests, pymysql and pandas as dependencies

API Reference

Modules

atelParser.commonutils

Common tools used throughout the package

atelParser.utKit

Unit testing tools

Classes

atelParser.download

Download ATels as Raw HTML files

atelParser.mysql

Import ATel into MySQL database and parse for names and coordinates

Functions

atelParser.mysql.clean_supernova_name

Clean a SN name.
