Compare commits


No commits in common. "master" and "v1.2.9" have entirely different histories.

62 changed files with 2854 additions and 1185 deletions


@ -1,101 +0,0 @@
name: CI Build Wheel
on:
push:
# branches: [master]
tags: ["v[0-9]+.[0-9]+.[0-9]+"]
jobs:
build_setup:
name: Prepare environment for wheel builds
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v2
- name: Prepare wheel build
run: make -C python/ setup
- name: Store wheel configuration files
uses: actions/upload-artifact@v4.6.0
with:
name: wheel-config
path: python/
- name: Display files
run: ls -lR
build_wheels:
name: Build binary wheels on ${{ matrix.os }}
runs-on: ${{ matrix.os }}
needs: [build_setup]
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
steps:
- uses: actions/checkout@v2
- uses: actions/setup-python@v2
- name: Install cibuildwheel
run: python -m pip install cibuildwheel==2.1.2
- name: Get wheel configuration files
uses: actions/download-artifact@v4.1.7
with:
name: wheel-config
path: python/
- name: Build wheels
run: python -m cibuildwheel --output-dir wheelhouse
working-directory: python/
- name: Store binary wheels
uses: actions/upload-artifact@v4.6.0
with:
name: binary-wheels-${{matrix.os}}-${{ strategy.job-index }}
path: python/wheelhouse/*.whl
build_sdist:
name: Build source distribution
runs-on: ubuntu-24.04
needs: [build_setup]
steps:
- uses: actions/checkout@v2
- name: Install cython
run: python -m pip install cython==0.29.24
- name: Get wheel configuration files
uses: actions/download-artifact@v4.1.7
with:
name: wheel-config
path: python/
- name: Build sdist
run: python setup.py sdist
working-directory: python/
- name: Store source wheels
uses: actions/upload-artifact@v4.6.0
with:
name: source-wheels
path: python/dist/*.tar.gz
- name: Display files
run: ls -lR
upload_pypi:
name: Upload wheels to PyPI
runs-on: ubuntu-24.04
needs: [build_wheels, build_sdist]
steps:
- name: Get source wheels
uses: actions/download-artifact@v4.1.7
with:
name: source-wheels
path: dist/
- name: Get binary wheels
uses: actions/download-artifact@v4.1.7
with:
path: dist/
pattern: binary-wheels-*
merge-multiple: true
- name: Display files
run: ls -lR
- uses: pypa/gh-action-pypi-publish@release/v1
with:
user: __token__
password: ${{ secrets.IMCTERMITE_GITHUB_WORKFLOW_PYPI_API_TOKEN }}
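
The `build_wheels` job above amounts to running cibuildwheel from the `python/` directory. The following is a minimal local sketch of that step, not part of the repository, and it assumes cibuildwheel's usual prerequisites (e.g. Docker on Linux) are available:

```Python
# Rough local equivalent of the "Build wheels" step above (a sketch, not repository code).
# Assumes cibuildwheel's prerequisites (e.g. Docker on Linux) are installed.
import subprocess, sys

# pin the same cibuildwheel version the workflow uses
subprocess.run([sys.executable, "-m", "pip", "install", "cibuildwheel==2.1.2"], check=True)

# build the wheels from the python/ directory into python/wheelhouse/
subprocess.run(
    [sys.executable, "-m", "cibuildwheel", "--output-dir", "wheelhouse"],
    cwd="python/",
    check=True,
)
```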

.gitignore (vendored): 14 changed lines

@ -2,7 +2,6 @@
eatraw
eatdev
imctermite
IMCtermite
nohup.out
@ -16,7 +15,6 @@ cython/*.cpp
*.log
*.so
*.pyd
*.o
*.csv
*.parquet
@ -31,15 +29,3 @@ pip/README.md
pip/LICENSE
pip/*egg-info
pip/dist/
pip/build/
python/README.md
python/README.rst
python/LICENSE
python/build
python/*.egg-info
python/dist
python/*.soc
python/lib/
python/*.cpp
python/wheelhouse/


@ -1,13 +1,14 @@
[![Total alerts](https://img.shields.io/lgtm/alerts/g/RecordEvolution/IMCtermite.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/RecordEvolution/IMCtermite/alerts/)
[![Language grade: C/C++](https://img.shields.io/lgtm/grade/cpp/g/RecordEvolution/IMCtermite.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/RecordEvolution/IMCtermite/context:cpp)
[![Language grade: Python](https://img.shields.io/lgtm/grade/python/g/RecordEvolution/IMCtermite.svg?logo=lgtm&logoWidth=18)](https://lgtm.com/projects/g/RecordEvolution/IMCtermite/context:python)
[![LICENSE](https://img.shields.io/github/license/RecordEvolution/IMCtermite)](https://img.shields.io/github/license/RecordEvolution/IMCtermite)
[![STARS](https://img.shields.io/github/stars/RecordEvolution/IMCtermite)](https://img.shields.io/github/stars/RecordEvolution/IMCtermite)
![CI Build Wheel](https://github.com/RecordEvolution/IMCtermite/actions/workflows/pypi-deploy.yml/badge.svg?branch=&event=push)
[![PYPI](https://img.shields.io/pypi/v/IMCtermite.svg)](https://pypi.org/project/imctermite/)
# IMCtermite
_IMCtermite_ provides access to the proprietary data format
_IMC2 Data Format_ with the file extension _.raw_ (or .dat) introduced and developed by
_IMC Bus Format_ with the file extension _.raw_ introduced and developed by
[imc Test & Measurement GmbH](https://www.imc-tm.de/). This data format is
employed i.a. by the measurement hardware
[imc CRONOSflex](https://www.imc-tm.de/produkte/messtechnik-hardware/imc-cronosflex/ueberblick/)
@ -18,9 +19,7 @@ for measurement data control and analysis. Thanks to the integrated Python modul
the extracted measurement data can be stored in any open-source file format
accessible by Python like i.a. _csv_, _json_ or _parquet_.
On the [Record Evolution Platform](https://www.record-evolution.de/en/home-en/),
the library can be used both as a command line tool for interactive usage and as a
Python module to integrate the _.raw_ format into any ETL workflow.
On the [Record Evolution Platform](https://www.record-evolution.de/en/home-en/), the library can be used both as a command line tool for interactive usage and as a Python module to integrate the _.raw_ format into any ETL workflow.
## Overview
@ -31,11 +30,10 @@ Python module to integrate the _.raw_ format into any ETL workflow.
## File format
A file of the _IMC2 Data Format_ type with extension _.raw_ (or .dat) is a _mixed text/binary
A data file of the _IMC Bus Format_ type with the extension _.raw_ is a _mixed text/binary
file_ featuring a set of markers (keys) that indicate the start of various blocks
of data that provide meta information and the actual measurement data. Every single
marker is introduced by the character `"|" = 0x 7c` followed by two uppercase letters that
characterize the type of marker. Each block is further divided into several
marker is introduced by the character `"|" = 0x 7c` followed by two uppercase letters that characterize the type of marker. Each block is further divided into several
parameters separated by commata `"," = 0x 2c` and terminated by a semicolon
`";" = 0x 3b`. For instance, the header - first 600 bytes - of a raw file may
look like this (in UTF-8 encoding):
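Independent of the exact header contents, a minimal sketch in plain Python (not part of the library; it only assumes the `|XX` marker convention described above) that lists the block keys found at the start of a file could be:

```Python
# Hypothetical helper: list the "|XX" block markers in the first bytes of a .raw file.
# Only the conventions described above are assumed (0x7c marker byte, two key letters).
import re

def list_markers(path, nbytes=600):
    with open(path, "rb") as f:
        head = f.read(nbytes)
    return [m.group().decode("ascii") for m in re.finditer(rb"\|[A-Z][A-Za-z]", head)]

print(list_markers("samples/sampleA.raw"))   # path is only an example
```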
@ -131,28 +129,28 @@ which may require root permissions.
### Python
To integrate the library into a customized ETL toolchain, several python targets
To integrate the library into a customized ETL toolchain, several cython targets
are available. For a local build that enables you to run the examples, use:
```
make python-build
make cython-build
```
However, in a production environment, a proper installation of the module with
`make cython-install` is recommended for system-wide availability of the module.
#### Installation with pip
The package is also available in the [Python Package Index](https://pypi.org)
at [imctermite](https://pypi.org/project/imctermite/).
at [IMCtermite](https://pypi.org/project/IMCtermite/).
To install the latest version simply do
```Shell
python3 -m pip install imctermite
python3 -m pip install IMCtermite
```
which provides binary wheels for multiple architectures on _Windows_ and _Linux_
and most _Python 3.x_ distributions. However, if your platform/architecture is
not supported you can still compile the source distribution yourself, which
requires _python3_setuptools_ and an up-to-date compiler supporting C++11
standard (e.g. _gcc version >= 10.2.0_).
Note that _python3_setuptools_ and _gcc version >= 10.2.0_ are required to
successfully install and use it.
## Usage
@ -188,23 +186,23 @@ options `imctermite sample-data.raw -b -c -s '|'`.
### Python
Given the `IMCtermite` module is available, we can import it and declare an instance
Given the `imctermite` module is available, we can import it and declare an instance
of it by passing a _raw_ file to the constructor:
```Python
import imctermite
import imc_termite
imcraw = imctermite.imctermite(b"sample/sampleA.raw")
imcraw = imc_termite.imctermite(b"sample/sampleA.raw")
```
An example of how to create an instance and obtain the list of channels is:
```Python
import IMCtermite
import imc_termite
# declare and initialize instance of "imctermite" by passing a raw-file
try :
imcraw = IMCtermite.imctermite(b"samples/sampleA.raw")
imcraw = imc_termite.imctermite(b"samples/sampleA.raw")
except RuntimeError as e :
print("failed to load/parse raw-file: " + str(e))
@ -213,37 +211,15 @@ channels = imcraw.get_channels(False)
print(channels)
```
A more complete [example](python/examples/usage.py), including the methods for
obtaining the channels, i.a. their data and/or directly printing them to files,
can be found in the `python/examples` folder.
A more complete [example](python/usage.py), including the methods for obtaining the
channels, i.a. their data and/or directly printing them to files, can be found
in the Python folder.
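A rough sketch of that kind of usage is shown below. The module name varies between the versions compared here (`imctermite` vs. `imc_termite`); the sketch uses the former, and the method signatures are assumptions based on the module's Cython interface:

```Python
# Hypothetical sketch (not taken from the repository's examples): read channel
# meta-data and dump all channels as CSV files. The output directory is assumed
# to exist, and signatures may differ between versions.
import imctermite

imcraw = imctermite.imctermite(b"samples/sampleA.raw")

for chn in imcraw.get_channels(False):        # meta-data only, no data arrays
    print(chn["name"], chn.get("yunit", ""))

imcraw.print_channels(b"./output", ord(','))  # one CSV file per channel
```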
## References
### IMC
- https://www.imc-tm.de/produkte/messtechnik-software/imc-famos/funktionen/im-und-export/
- https://www.imc-tm.de/produkte/messtechnik-hardware/imc-cronosflex/ueberblick/
- https://www.imc-tm.de/download-center/produkt-downloads/imc-famos/handbuecher
- https://www.imc-tm.de/fileadmin/Public/Downloads/Manuals/imc_FAMOS/imcGemeinsameKomponenten.pdf
- https://cython.readthedocs.io/en/latest/src/userguide/wrapping_CPlusPlus.html
- https://github.com/Apollo3zehn/ImcFamosFile
- https://apollo3zehn.github.io/ImcFamosFile/api/ImcFamosFile.FamosFileKeyType.html
### Cython
- https://cython.readthedocs.io/en/latest/src/userguide/wrapping_CPlusPlus.html
### PyPI
- https://pypi.org/help/#apitoken
- https://sgoel.dev/posts/uploading-binary-wheels-to-pypi-from-github-actions/
- https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idstepsrun
- https://github.com/pypa/cibuildwheel/blob/main/examples/github-deploy.yml
- https://cibuildwheel.readthedocs.io/en/stable/deliver-to-pypi/
- https://github.com/actions/download-artifact#download-all-artifacts
- https://github.com/actions/download-artifact?tab=readme-ov-file#download-multiple-filtered-artifacts-to-the-same-directory
### iconv
- https://www.gnu.org/software/libiconv/
- https://vcpkg.io/en/packages.html
- https://vcpkg.io/en/getting-started

assets/imctermite.svg (new file): 111 lines added, 4.5 KiB

@ -0,0 +1,111 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
height="78.080002"
width="325.72"
id="svg3945"
version="1.1"
viewBox="0 0 325.72 78.08"
sodipodi:docname="imctermite.svg"
inkscape:version="1.0.1 (3bc2e813f5, 2020-09-07)">
<sodipodi:namedview
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1"
objecttolerance="10"
gridtolerance="10"
guidetolerance="10"
inkscape:pageopacity="0"
inkscape:pageshadow="2"
inkscape:window-width="2048"
inkscape:window-height="1088"
id="namedview18"
showgrid="false"
inkscape:zoom="2.5577614"
inkscape:cx="138.13369"
inkscape:cy="-27.086877"
inkscape:window-x="0"
inkscape:window-y="32"
inkscape:window-maximized="1"
inkscape:current-layer="svg3945"
inkscape:document-rotation="0" />
<metadata
id="metadata3951">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title>flasher</dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<defs
id="defs3949" />
<title
id="title3916">flasher</title>
<g
id="logog"
transform="translate(0,0.99981694)">
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="m 32.86,2 -13,7.5 v 0 h -0.05 v 0 l -0.48,0.28 c -4.27,2.46 -5.68,11.38 -6.06,14.75 L 36.2,11.33 c 0.39,-0.19 7.6,-3.69 13.57,-3.69 h 0.14 L 40.13,2 a 8.15,8.15 0 0 0 -7.27,0"
id="path138"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="M 5.68,17.69 A 8.2,8.2 0 0 0 2,24 v 15.78 c 0,4.9 7,10.48 9.75,12.46 V 25.77 c 0,-0.44 0.6,-8.55 3.65,-13.72 z"
id="path142"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="m 12.1,54.12 v 0 C 11.74,53.88 5,49.41 2,44.24 v 11.14 a 8.2,8.2 0 0 0 3.64,6.3 l 13.5,7.79 c 4.28,2.46 12.7,-0.77 15.81,-2.12 z"
id="path146"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="m 36.79,68 c -0.4,0.19 -7.71,3.75 -13.71,3.69 l 9.78,5.64 a 8.15,8.15 0 0 0 7.27,0 l 13.51,-7.8 c 4.27,-2.46 5.68,-11.39 6.06,-14.75 z"
id="path150"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="M 61.2,27.13 V 53.6 c 0,0.44 -0.6,8.55 -3.65,13.72 l 9.77,-5.64 A 8.2,8.2 0 0 0 71,55.38 V 39.59 c 0,-4.94 -7,-10.5 -9.75,-12.46"
id="path154"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="M 67.31,17.69 53.81,9.9 C 49.53,7.44 41.11,10.67 38,12 l 22.85,13.23 v 0 a 43.43,43.43 0 0 1 5.7,4.51 24,24 0 0 1 4.45,5.35 V 24 a 8.2,8.2 0 0 0 -3.64,-6.3"
id="path158"
inkscape:connector-curvature="0" />
</g>
<g
id="re"
transform="translate(0,0.99981694)" />
<text
id="text3955"
y="55.47554"
x="74.101189"
style="font-style:normal;font-weight:normal;font-size:40px;line-height:1.25;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#364d5c;fill-opacity:1;stroke:none"
xml:space="preserve"><tspan
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:44px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Bold';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;fill:#364d5c;fill-opacity:1"
y="55.47554"
x="74.101189"
id="tspan3953"><tspan
id="tspan24"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-family:sans-serif;-inkscape-font-specification:'sans-serif Bold'">IMC</tspan><tspan
id="tspan3845"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-family:sans-serif;-inkscape-font-specification:sans-serif">termite</tspan> </tspan></text>
</svg>


assets/raweater.svg (new file): 112 lines added, 4.6 KiB

@ -0,0 +1,112 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:cc="http://creativecommons.org/ns#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:svg="http://www.w3.org/2000/svg"
xmlns="http://www.w3.org/2000/svg"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
height="78.080002"
width="290.72"
id="svg3945"
version="1.1"
viewBox="0 0 290.72 78.08"
sodipodi:docname="raweater.svg"
inkscape:version="0.92.5 (2060ec1f9f, 2020-04-08)">
<sodipodi:namedview
pagecolor="#ffffff"
bordercolor="#666666"
borderopacity="1"
objecttolerance="10"
gridtolerance="10"
guidetolerance="10"
inkscape:pageopacity="0"
inkscape:pageshadow="2"
inkscape:window-width="1360"
inkscape:window-height="704"
id="namedview18"
showgrid="false"
inkscape:zoom="0.90430522"
inkscape:cx="191.86"
inkscape:cy="38.540001"
inkscape:window-x="0"
inkscape:window-y="27"
inkscape:window-maximized="1"
inkscape:current-layer="svg3945" />
<metadata
id="metadata3951">
<rdf:RDF>
<cc:Work
rdf:about="">
<dc:format>image/svg+xml</dc:format>
<dc:type
rdf:resource="http://purl.org/dc/dcmitype/StillImage" />
<dc:title>flasher</dc:title>
</cc:Work>
</rdf:RDF>
</metadata>
<defs
id="defs3949" />
<title
id="title3916">flasher</title>
<g
id="logog"
transform="translate(0,0.99981694)">
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="m 32.86,2 -13,7.5 v 0 h -0.05 v 0 l -0.48,0.28 c -4.27,2.46 -5.68,11.38 -6.06,14.75 L 36.2,11.33 c 0.39,-0.19 7.6,-3.69 13.57,-3.69 h 0.14 L 40.13,2 a 8.15,8.15 0 0 0 -7.27,0"
id="path138"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="M 5.68,17.69 A 8.2,8.2 0 0 0 2,24 v 15.78 c 0,4.9 7,10.48 9.75,12.46 V 25.77 c 0,-0.44 0.6,-8.55 3.65,-13.72 z"
id="path142"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="m 12.1,54.12 v 0 C 11.74,53.88 5,49.41 2,44.24 v 11.14 a 8.2,8.2 0 0 0 3.64,6.3 l 13.5,7.79 c 4.28,2.46 12.7,-0.77 15.81,-2.12 z"
id="path146"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="m 36.79,68 c -0.4,0.19 -7.71,3.75 -13.71,3.69 l 9.78,5.64 a 8.15,8.15 0 0 0 7.27,0 l 13.51,-7.8 c 4.27,-2.46 5.68,-11.39 6.06,-14.75 z"
id="path150"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="M 61.2,27.13 V 53.6 c 0,0.44 -0.6,8.55 -3.65,13.72 l 9.77,-5.64 A 8.2,8.2 0 0 0 71,55.38 V 39.59 c 0,-4.94 -7,-10.5 -9.75,-12.46"
id="path154"
inkscape:connector-curvature="0" />
<path
style="fill:#364d5c"
transform="translate(-2.04,-1.15)"
d="M 67.31,17.69 53.81,9.9 C 49.53,7.44 41.11,10.67 38,12 l 22.85,13.23 v 0 a 43.43,43.43 0 0 1 5.7,4.51 24,24 0 0 1 4.45,5.35 V 24 a 8.2,8.2 0 0 0 -3.64,-6.3"
id="path158"
inkscape:connector-curvature="0" />
</g>
<g
id="re"
transform="translate(0,0.99981694)" />
<text
id="text3955"
y="55.47554"
x="74.101189"
style="font-style:normal;font-weight:normal;font-size:40px;line-height:1.25;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#364d5c;fill-opacity:1;stroke:none"
xml:space="preserve"><tspan
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-size:44px;font-family:sans-serif;-inkscape-font-specification:'sans-serif, Bold';font-variant-ligatures:normal;font-variant-caps:normal;font-variant-numeric:normal;font-feature-settings:normal;text-align:start;writing-mode:lr-tb;text-anchor:start;fill:#364d5c;fill-opacity:1"
y="55.47554"
x="74.101189"
id="tspan3953"><tspan
id="tspan24"
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-family:sans-serif;-inkscape-font-specification:'sans-serif Bold'">R</tspan><tspan
id="tspan3845"
style="font-style:normal;font-variant:normal;font-weight:normal;font-stretch:normal;font-family:sans-serif;-inkscape-font-specification:sans-serif">aw<tspan
style="font-style:normal;font-variant:normal;font-weight:bold;font-stretch:normal;font-family:sans-serif;-inkscape-font-specification:'sans-serif Bold'"
id="tspan22">E</tspan>ater</tspan> </tspan></text>
</svg>



@ -1,23 +1,19 @@
# cython: language_level = 3
# use some C++ STL libraries
from libcpp.string cimport string
from libcpp.vector cimport vector
from libcpp cimport bool
cdef extern from "lib/imc_raw.hpp" namespace "imc":
cdef cppclass cppimctermite "imc::raw":
cdef extern from "imc_raw.hpp" namespace "imc":
cdef cppclass imc_termite "imc::raw":
# constructor(s)
cppimctermite() except +
cppimctermite(string rawfile) except +
imc_termite() except +
imc_termite(string rawfile) except +
# provide raw file
void set_file(string rawfile) except +
# get JSON list of channels
vector[string] get_channels(bool json, bool data) except +
# print single channel/all channels
void print_channel(string channeluuid, string outputdir, char delimiter) except +
void print_channels(string outputdir, char delimiter) except +


@ -1,49 +1,39 @@
# distutils: language = c++
# cython: language_level = 3
from imctermite cimport cppimctermite
from imc_termite cimport imc_termite
import json as jn
import decimal
import platform
# auxiliary function for codepage conversion
def get_codepage(chn) :
if platform == 'Windows' :
chndec = jn.loads(chn.decode(errors="ignore"))
chncdp = chndec["codepage"]
return 'utf-8' if chncdp is None else chncdp
else :
return 'utf-8'
# import numpy as np
cdef class imctermite:
# C++ instance of class => stack allocated (requires nullary constructor!)
cdef cppimctermite cppimc
cdef imc_termite cpp_imc
# constructor
def __cinit__(self, string rawfile):
self.cppimc = cppimctermite(rawfile)
self.cpp_imc = imc_termite(rawfile)
# provide raw file
def submit_file(self,string rawfile):
self.cppimc.set_file(rawfile)
self.cpp_imc.set_file(rawfile)
# get JSON list of channels
def get_channels(self, bool include_data):
chnlst = self.cppimc.get_channels(True,include_data)
chnlstjn = [jn.loads(chn.decode(get_codepage(chn),errors="ignore")) for chn in chnlst]
def get_channels(self, bool data):
chnlst = self.cpp_imc.get_channels(True,data)
chnlstjn = [jn.loads(chn.decode(errors="ignore")) for chn in chnlst]
return chnlstjn
# print single channel/all channels
def print_channel(self, string channeluuid, string outputfile, char delimiter):
self.cppimc.print_channel(channeluuid,outputfile,delimiter)
self.cpp_imc.print_channel(channeluuid,outputfile,delimiter)
def print_channels(self, string outputdir, char delimiter):
self.cppimc.print_channels(outputdir,delimiter)
self.cpp_imc.print_channels(outputdir,delimiter)
# print table including channels
def print_table(self, string outputfile):
chnlst = self.cppimc.get_channels(True,True)
chnlst = self.cpp_imc.get_channels(True,True)
chnlstjn = [jn.loads(chn.decode(errors="ignore")) for chn in chnlst]
with open(outputfile.decode(),'w') as fout:
for chn in chnlstjn:

cython/raw_eater.pxd (new file): 41 lines added

@ -0,0 +1,41 @@
# cython: language_level = 3
# distutils: language = c++
# use some C++ STL libraries
from libcpp.string cimport string
from libcpp.vector cimport vector
from libcpp cimport bool
# to include implementation/definition file
#cdef extern from "raweat.cpp":
# pass
# these method names have to match the C definitions of the methods!!
#
# for how to overload the constructor see
# https://cython.readthedocs.io/en/latest/src/userguide/wrapping_CPlusPlus.html
# and propagating exceptions from C++ to Python
# http://docs.cython.org/en/latest/src/userguide/wrapping_CPlusPlus.html#exceptions
cdef extern from "../lib/raweat.hpp":
cdef cppclass raw_eater:
# constructor(s)
raw_eater() except +
raw_eater(string) except +
# set new file for decoding
void set_file(string)
# perform conversion (pass any C++ exceptions to Python)
void setup_and_conversion() except +
# get validity of data format
bool get_valid()
# get channel name and unit
string get_name()
string get_unit()
# get time step and time unit
double get_dt()
string get_temp_unit()
# get data array of time and measured quantity's channel
vector[double] get_time()
vector[double] get_data()
# dump all data to .csv
void write_table(const char*,char delimiter)

cython/raw_eater.pyx (new file): 58 lines added

@ -0,0 +1,58 @@
from raw_eater cimport raweater
import numpy as np
import re
import os
cdef class raweater:
# C++ instance of class => stack allocated (requires nullary constructor!)
cdef raw_eater rawit
# pointer to C++ instance (if there's no nullary constructor)
# cdef raw_eater *rawit
def __cinit__(self, string rawfile = b''):
if rawfile.decode() == "":
self.rawit = raw_eater()
# self.rawit = new raw_eater()
else:
if not os.path.isfile(rawfile) :
raise ValueError("'" + str(rawfile) + "' does not exist")
self.rawit = raw_eater(rawfile)
# self.rawit = new raw_eater(rawfile)
# def __dealloc__(self):
# del self.rawit
def set_file(self, string rawfile):
if not os.path.isfile(rawfile) :
raise ValueError("'" + str(rawfile) + "' does not exist")
self.rawit.set_file(rawfile)
def do_conversion(self):
self.rawit.setup_and_conversion()
def validity(self):
return self.rawit.get_valid()
def channel_name(self):
return self.rawit.get_name()
def unit(self):
return self.rawit.get_unit()
def dt(self):
return self.rawit.get_dt()
def time_unit(self):
return self.rawit.get_temp_unit()
def get_time(self):
return self.rawit.get_time()
def get_channel(self):
return self.rawit.get_data()
def write_table(self, const char* csvfile, char delimiter):
self.rawit.write_table(csvfile,delimiter)

cython/raw_meat.pxd (new file): 37 lines added

@ -0,0 +1,37 @@
# cython: language_level = 3
# distutils: language = c++
# use some C++ STL libraries
from libcpp.string cimport string
from libcpp.vector cimport vector
from libcpp cimport bool
# these method names have to match the C++ definitions of the methods!!
cdef extern from "../lib/rawmerge.hpp":
cdef cppclass raw_merger:
raw_merger(string) except +
# get validity of data format
bool get_valid()
# get channel name and unit
string get_name()
string get_unit()
# get time step and time unit
double get_dt()
string get_temp_unit()
# get data array of time and measured quantity's channel
vector[double] get_time()
vector[double] get_data()
# dump all data to .csv
void write_table(const char*,char)
# add channel and try to merge it (pass C++ exceptions to Python)
bool add_channel(string) except +
# get total number of (added) channels
int get_num_channels()
# get list of channel names
vector[string] get_channel_names()
# get data of particular channel
vector[double] get_channel(int)
# get total merged time series
vector[double] get_time_series()
# dump all channels to .csv
void write_table_all(const char*,char)

cython/raw_meat.pyx (new file): 58 lines added

@ -0,0 +1,58 @@
# from <raw_meat> has to match name of .pxd file and cimport name of class defined in .pxd
from raw_meat cimport raw_merger
import numpy as np
import re
cdef class rawmerger:
# pointer to C++ instance (since there's no nullary constructor)
cdef raw_merger *rawit
def __cinit__(self, string rawfile):
self.rawit = new raw_merger(rawfile)
def __dealloc__(self):
del self.rawit
def validity(self):
return self.rawit.get_valid()
def channel_name(self):
return self.rawit.get_name()
def unit(self):
return self.rawit.get_unit()
def dt(self):
return self.rawit.get_dt()
def time_unit(self):
return self.rawit.get_temp_unit()
def get_time(self):
return self.rawit.get_time()
def get_channel(self):
return self.rawit.get_data()
def write_table(self, const char* csvfile, char delimiter):
return self.rawit.write_table(csvfile,delimiter)
def add_channel(self, string rawfile):
return self.rawit.add_channel(rawfile)
def get_num_channels(self):
return self.rawit.get_num_channels()
def get_channel_names(self):
return self.rawit.get_channel_names()
def get_channel_by_index(self, int chidx):
return self.rawit.get_channel(chidx)
def get_time_series(self):
return self.rawit.get_time_series()
def write_table_all(self, const char* csvfile, char delimiter):
return self.rawit.write_table_all(csvfile,delimiter)

cython/setup.py (new file): 24 lines added

@ -0,0 +1,24 @@
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
extensions = Extension(
name="imc_termite",
sources=["cython/py_imc_termite.pyx"],
# libraries=[""],
# library_dirs=["lib"],
include_dirs=["lib"],
language='c++',
extra_compile_args=['-std=c++17','-Wno-unused-variable'],
extra_link_args=['-std=c++17'],
)
setup(
name="imc_termite",
version='1.2.8',
description='IMCtermite cython extension',
author='Record Evolution GmbH',
author_email='mario.fink@record-evolution.de',
url='https://github.com/RecordEvolution/IMCtermite.git',
ext_modules=cythonize(extensions,force=True)
)

cython/setup_raw_eater.py (new file): 20 lines added

@ -0,0 +1,20 @@
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
extensions = Extension(
name="raw_eater",
sources=["cython/raw_eater.pyx"],
# libraries=[""],
library_dirs=["src"],
include_dirs=["src"],
language='c++',
extra_compile_args=['-std=c++11','-Wno-unused-variable'],
extra_link_args=['-std=c++11'],
#extra_objects=["lib/parquet/libarrow.so.200.0.0"],
)
setup(
name="raw_eater",
ext_modules=cythonize(extensions)
)

cython/setup_raw_meat.py (new file): 20 lines added

@ -0,0 +1,20 @@
from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize
extensions = Extension(
name="raw_meat",
sources=["cython/raw_meat.pyx"],
# libraries=[""],
library_dirs=["src"],
include_dirs=["src"],
language='c++',
extra_compile_args=['-std=c++11','-Wno-unused-variable'],
extra_link_args=['-std=c++11'],
#extra_objects=["lib/parquet/libarrow.so.200.0.0"],
)
setup(
name="raw_meat",
ext_modules=cythonize(extensions)
)


@ -5,78 +5,43 @@
#include "imc_datatype.hpp"
#include "imc_conversion.hpp"
#include "imc_block.hpp"
#include <sstream>
#include <math.h>
#include <chrono>
#include <ctime>
#include <time.h>
#if defined(__linux__) || defined(__APPLE__)
#include <iconv.h>
#elif defined(__WIN32__) || defined(_WIN32)
#define timegm _mkgmtime
#endif
//---------------------------------------------------------------------------//
namespace imc
{
struct component_env
{
std::string uuid_;
// required channel components for CG channels only
std::string CCuuid_, CPuuid_;
// optional channel components for CG channels only
std::string CDuuid_, NTuuid_;
std::string Cbuuid_, CRuuid_;
// reset all members
void reset()
{
uuid_.clear();
CCuuid_.clear();
CPuuid_.clear();
CDuuid_.clear();
Cbuuid_.clear();
CRuuid_.clear();
NTuuid_.clear();
}
};
// collect uuid's of blocks required for full channel reconstruction
struct channel_env
{
// define unique identifier for channel_env
std::string uuid_;
// collect common affiliate blocks for every channel
std::string NOuuid_, NLuuid_;
// collect affiliate blocks for a single channel
// channel types
std::string CBuuid_, CGuuid_, CIuuid_, CTuuid_;
std::string CNuuid_, CDuuid_, NTuuid_;
std::string CSuuid_;
component_env compenv1_;
component_env compenv2_;
std::string CBuuid_, CGuuid_, CCuuid_, CNuuid_;
std::string CDuuid_, CTuuid_, Cbuuid_, CPuuid_, CRuuid_, CSuuid_;
std::string NTuuid_, NOuuid_, NLuuid_;
// reset all members
void reset()
{
uuid_.clear();
NOuuid_.clear();
NLuuid_.clear();
CBuuid_.clear();
CGuuid_.clear();
CIuuid_.clear();
CTuuid_.clear();
CCuuid_.clear();
CNuuid_.clear();
CDuuid_.clear();
NTuuid_.clear();
CTuuid_.clear();
Cbuuid_.clear();
CPuuid_.clear();
CRuuid_.clear();
CSuuid_.clear();
compenv1_.reset();
compenv2_.reset();
NTuuid_.clear();
NOuuid_.clear();
NLuuid_.clear();
}
// get info
@ -84,23 +49,21 @@ namespace imc
{
std::stringstream ss;
ss<<std::setw(width)<<std::left<<"uuid:"<<uuid_<<"\n"
<<std::setw(width)<<std::left<<"NOuuid:"<<NOuuid_<<"\n"
<<std::setw(width)<<std::left<<"NLuuid:"<<NLuuid_<<"\n"
//
<<std::setw(width)<<std::left<<"CBuuid:"<<CBuuid_<<"\n"
<<std::setw(width)<<std::left<<"CGuuid:"<<CGuuid_<<"\n"
<<std::setw(width)<<std::left<<"CIuuid:"<<CIuuid_<<"\n"
<<std::setw(width)<<std::left<<"CTuuid:"<<CTuuid_<<"\n"
<<std::setw(width)<<std::left<<"CCuuid:"<<CCuuid_<<"\n"
<<std::setw(width)<<std::left<<"CNuuid:"<<CNuuid_<<"\n"
//
<<std::setw(width)<<std::left<<"CCuuid:"<<compenv1_.CCuuid_<<"\n"
<<std::setw(width)<<std::left<<"CPuuid:"<<compenv1_.CPuuid_<<"\n"
<<std::setw(width)<<std::left<<"CDuuid:"<<CDuuid_<<"\n"
<<std::setw(width)<<std::left<<"CTuuid:"<<CTuuid_<<"\n"
<<std::setw(width)<<std::left<<"Cbuuid:"<<Cbuuid_<<"\n"
<<std::setw(width)<<std::left<<"CPuuid:"<<CPuuid_<<"\n"
<<std::setw(width)<<std::left<<"CRuuid:"<<CRuuid_<<"\n"
<<std::setw(width)<<std::left<<"CSuuid:"<<CSuuid_<<"\n"
//
<<std::setw(width)<<std::left<<"CDuuid:"<<compenv1_.CDuuid_<<"\n"
<<std::setw(width)<<std::left<<"Cbuuid:"<<compenv1_.Cbuuid_<<"\n"
<<std::setw(width)<<std::left<<"CRuuid:"<<compenv1_.CRuuid_<<"\n"
<<std::setw(width)<<std::left<<"NTuuid:"<<compenv1_.NTuuid_<<"\n"
<<std::setw(width)<<std::left<<"CSuuid:"<<CSuuid_<<"\n";
<<std::setw(width)<<std::left<<"NTuuid:"<<NTuuid_<<"\n"
<<std::setw(width)<<std::left<<"NOuuid:"<<NOuuid_<<"\n"
<<std::setw(width)<<std::left<<"NLuuid:"<<NLuuid_<<"\n";
return ss.str();
}
@ -109,20 +72,19 @@ namespace imc
{
std::stringstream ss;
ss<<"{"<<"\"uuid\":\""<<uuid_
<<"\",\"NOuuid\":\""<<NOuuid_
<<"\",\"NLuuid\":\""<<NLuuid_
<<"\",\"CBuuid\":\""<<CBuuid_
<<"\",\"CGuuid\":\""<<CGuuid_
<<"\",\"CIuuid\":\""<<CIuuid_
<<"\",\"CTuuid\":\""<<CTuuid_
<<"\",\"CCuuid\":\""<<CCuuid_
<<"\",\"CNuuid\":\""<<CNuuid_
<<"\",\"CCuuid\":\""<<compenv1_.CCuuid_
<<"\",\"CPuuid\":\""<<compenv1_.CPuuid_
<<"\",\"CDuuid\":\""<<compenv1_.CDuuid_
<<"\",\"Cbuuid\":\""<<compenv1_.Cbuuid_
<<"\",\"CRuuid\":\""<<compenv1_.CRuuid_
<<"\",\"NTuuid\":\""<<compenv1_.NTuuid_
<<"\",\"CDuuid\":\""<<CDuuid_
<<"\",\"CTuuid\":\""<<CTuuid_
<<"\",\"Cbuuid\":\""<<Cbuuid_
<<"\",\"CPuuid\":\""<<CPuuid_
<<"\",\"CRuuid\":\""<<CRuuid_
<<"\",\"CSuuid\":\""<<CSuuid_
<<"\",\"NTuuid\":\""<<NTuuid_
<<"\",\"NOuuid\":\""<<NOuuid_
<<"\",\"NLuuid\":\""<<NLuuid_
<<"\"}";
return ss.str();
}
@ -146,8 +108,7 @@ namespace imc
std::string joinvec(std::vector<dt> myvec, unsigned long int limit = 10, int prec = 10, bool fixed = true)
{
// include entire list for limit = 0
unsigned long int myvecsize = (unsigned long int)myvec.size();
limit = (limit == 0) ? myvecsize : limit;
limit = (limit == 0) ? myvec.size() : limit;
std::stringstream ss;
ss<<"[";
@ -161,14 +122,14 @@ namespace imc
}
else
{
unsigned long int heals = limit/2;
unsigned long int heals = (unsigned long int)(limit/2.);
for ( unsigned long int i = 0; i < heals; i++ )
{
customize_stream(ss,prec,fixed);
ss<<myvec[i]<<",";
}
ss<<"...";
for ( unsigned long int i = myvecsize-heals; i < myvecsize; i++ )
for ( unsigned long int i = myvec.size()-heals; i < myvec.size(); i++ )
{
customize_stream(ss,prec,fixed);
ss<<myvec[i]<<",";
@ -180,131 +141,6 @@ namespace imc
return sumstr;
}
#if defined(__linux__) || defined(__APPLE__)
// convert encoding of any descriptions, channel-names, units etc.
class iconverter
{
std::string in_enc_, out_enc_;
iconv_t cd_;
size_t out_buffer_size_;
public:
iconverter(std::string in_enc, std::string out_enc, size_t out_buffer_size = 1024) :
in_enc_(in_enc), out_enc_(out_enc), out_buffer_size_(out_buffer_size)
{
// allocate descriptor for character set conversion
// (https://man7.org/linux/man-pages/man3/iconv_open.3.html)
cd_ = iconv_open(out_enc.c_str(), in_enc.c_str());
if ( (iconv_t)-1 == cd_ )
{
if ( errno == EINVAL )
{
std::string errmsg = std::string("The encoding conversion from ") + in_enc
+ std::string(" to ") + out_enc + std::string(" is not supported by the implementation.");
throw std::runtime_error(errmsg);
}
}
}
void convert(std::string &astring)
{
if ( astring.empty() ) return;
std::vector<char> in_buffer(astring.begin(),astring.end());
char *inbuf = &in_buffer[0];
size_t inbytes = in_buffer.size();
std::vector<char> out_buffer(out_buffer_size_);
char *outbuf = &out_buffer[0];
size_t outbytes = out_buffer.size();
// perform character set conversion
// ( - https://man7.org/linux/man-pages/man3/iconv.3.html
// - https://www.ibm.com/docs/en/zos/2.2.0?topic=functions-iconv-code-conversion )
while ( inbytes > 0 )
{
size_t res = iconv(cd_,&inbuf,&inbytes,&outbuf,&outbytes);
if ( (size_t)-1 == res )
{
std::string errmsg;
if ( errno == EILSEQ )
{
errmsg = std::string("An invalid multibyte sequence is encountered in the input.");
throw std::runtime_error(errmsg);
}
else if ( errno == EINVAL )
{
errmsg = std::string("An incomplete multibyte sequence is encountered in the input")
+ std::string(" and the input byte sequence terminates after it.");
}
else if ( errno == E2BIG )
{
errmsg = std::string("The output buffer has no more room for the next converted character.");
}
throw std::runtime_error(errmsg);
}
}
std::string outstring(out_buffer.begin(),out_buffer.end()-outbytes);
astring = outstring;
}
};
#elif defined(__WIN32__) || defined(_WIN32)
class iconverter
{
public:
iconverter(std::string in_enc, std::string out_enc, size_t out_buffer_size = 1024) {}
void convert(std::string &astring) {}
};
#endif
struct component_group
{
imc::component CC_;
imc::packaging CP_;
imc::abscissa CD_;
imc::buffer Cb_;
imc::range CR_;
imc::channelobj CN_;
imc::triggertime NT_;
component_env compenv_;
// Constructor to parse the associated blocks
component_group(component_env &compenv, std::map<std::string, imc::block>* blocks, std::vector<unsigned char>* buffer)
: compenv_(compenv)
{
if (blocks->count(compenv.CCuuid_) == 1)
{
CC_.parse(buffer, blocks->at(compenv.CCuuid_).get_parameters());
}
if (blocks->count(compenv.CPuuid_) == 1)
{
CP_.parse(buffer, blocks->at(compenv.CPuuid_).get_parameters());
}
if (blocks->count(compenv.CDuuid_) == 1)
{
CD_.parse(buffer, blocks->at(compenv.CDuuid_).get_parameters());
}
if (blocks->count(compenv.Cbuuid_) == 1)
{
Cb_.parse(buffer, blocks->at(compenv.Cbuuid_).get_parameters());
}
if (blocks->count(compenv.CRuuid_) == 1)
{
CR_.parse(buffer, blocks->at(compenv.CRuuid_).get_parameters());
}
if (blocks->count(compenv.NTuuid_) == 1)
{
NT_.parse(buffer, blocks->at(compenv.NTuuid_).get_parameters());
}
}
};
// channel
struct channel
{
@ -313,16 +149,8 @@ namespace imc
std::map<std::string,imc::block>* blocks_;
std::vector<unsigned char>* buffer_;
imc::origin_data NO_;
imc::language NL_;
imc::text CT_;
imc::groupobj CB_;
imc::datafield CG_;
imc::channelobj CN_;
// collect meta-data of channels according to env,
// just everything valuable in here
// TODO: is this necessary?
std::string uuid_;
std::string name_, comment_;
std::string origin_, origin_comment_, text_;
@ -331,330 +159,242 @@ namespace imc
std::string language_code_, codepage_;
std::string yname_, yunit_;
std::string xname_, xunit_;
double xstepwidth_, xstart_;
double xstepwidth_, xoffset_;
int xprec_;
int dimension_;
// buffer and data
int xsignbits_, xnum_bytes_;
int ysignbits_, ynum_bytes_;
int signbits_, num_bytes_;
// unsigned long int byte_offset_;
unsigned long int xbuffer_offset_, ybuffer_offset_;
unsigned long int xbuffer_size_, ybuffer_size_;
unsigned long int buffer_offset_, buffer_size_;
long int addtime_;
imc::numtype xdatatp_, ydatatp_;
std::vector<imc::datatype> xdata_, ydata_;
int datatp_;
imc::datatype dattyp_;
std::vector<imc::datatype> ydata_;
std::vector<double> xdata_;
// range, factor and offset
double xfactor_, yfactor_;
double xoffset_, yoffset_;
double factor_, offset_;
// group reference the channel belongs to
unsigned long int group_index_;
int group_index_;
std::string group_uuid_, group_name_, group_comment_;
// constructor takes channel's block environment
channel(channel_env &chnenv, std::map<std::string,imc::block>* blocks,
std::vector<unsigned char>* buffer):
chnenv_(chnenv), blocks_(blocks), buffer_(buffer),
xfactor_(1.), yfactor_(1.), xoffset_(0.), yoffset_(0.),
factor_(1.), offset_(0.),
group_index_(-1)
{
// declare list of block parameters
std::vector<imc::parameter> prms;
// use uuid from CN block
uuid_ = chnenv_.CNuuid_;
// extract associated NO data
if ( blocks_->count(chnenv_.NOuuid_) == 1 )
{
NO_.parse(buffer_, blocks_->at(chnenv_.NOuuid_).get_parameters());
origin_ = NO_.generator_;
comment_ = NO_.comment_;
}
// extract associated NL data
if ( blocks_->count(chnenv_.NLuuid_) == 1 )
{
NL_.parse(buffer_, blocks_->at(chnenv_.NLuuid_).get_parameters());
codepage_ = NL_.codepage_;
language_code_ = NL_.language_code_;
}
// extract associated CB data
if ( blocks_->count(chnenv_.CBuuid_) == 1 )
{
CB_.parse(buffer_, blocks_->at(chnenv_.CBuuid_).get_parameters());
prms = blocks_->at(chnenv_.CBuuid_).get_parameters();
group_index_ = std::stoi(blocks_->at(chnenv_.CBuuid_).get_parameter(prms[2]));
group_name_ = blocks_->at(chnenv_.CBuuid_).get_parameter(prms[4]);
group_comment_ = blocks_->at(chnenv_.CBuuid_).get_parameter(prms[6]);
}
// extract associated CT data
if ( blocks_->count(chnenv_.CTuuid_) == 1 )
{
CT_.parse(buffer_, blocks_->at(chnenv_.CTuuid_).get_parameters());
text_ = CT_.name_ + std::string(" - ")
+ CT_.text_ + std::string(" - ")
+ CT_.comment_;
prms = blocks_->at(chnenv_.CTuuid_).get_parameters();
text_ = blocks_->at(chnenv_.CTuuid_).get_parameter(prms[4]) + std::string(" - ")
+ blocks_->at(chnenv_.CTuuid_).get_parameter(prms[6]) + std::string(" - ")
+ blocks_->at(chnenv_.CTuuid_).get_parameter(prms[8]);
}
// extract associated CD data
if ( blocks_->count(chnenv_.CDuuid_) == 1 )
{
prms = blocks_->at(chnenv_.CDuuid_).get_parameters();
xstepwidth_ = std::stod(blocks_->at(chnenv_.CDuuid_).get_parameter(prms[2]));
xunit_ = blocks_->at(chnenv_.CDuuid_).get_parameter(prms[5]);
// TODO
// xname_ = std::string("time");
// find appropriate precision for "xdata_" by means of "xstepwidth_"
xprec_ = (xstepwidth_ > 0 ) ? (int)ceil(fabs(log10(xstepwidth_))) : 10;
}
// extract associated CP data
if ( blocks_->count(chnenv_.CPuuid_) == 1 )
{
prms = blocks_->at(chnenv_.CPuuid_).get_parameters();
num_bytes_ = std::stoi(blocks_->at(chnenv_.CPuuid_).get_parameter(prms[3]));
datatp_ = std::stoi(blocks_->at(chnenv_.CPuuid_).get_parameter(prms[4]));
signbits_ = std::stoi(blocks_->at(chnenv_.CPuuid_).get_parameter(prms[5]));
// byte_offset_ = std::stoul(blocks_->at(chnenv_.CPuuid_).get_parameter(prms[7]));
}
// extract associated Cb data
if ( blocks_->count(chnenv_.Cbuuid_) == 1 )
{
prms = blocks_->at(chnenv_.Cbuuid_).get_parameters();
buffer_offset_ = std::stoul(blocks_->at(chnenv_.Cbuuid_).get_parameter(prms[6]));
buffer_size_ = std::stoul(blocks_->at(chnenv_.Cbuuid_).get_parameter(prms[7]));
xoffset_ = std::stod(blocks_->at(chnenv_.Cbuuid_).get_parameter(prms[11]));
addtime_ = (long int)std::stod(blocks_->at(chnenv_.Cbuuid_).get_parameter(prms[12]));
}
// extract associated CR data
if ( blocks_->count(chnenv_.CRuuid_) == 1 )
{
prms = blocks_->at(chnenv_.CRuuid_).get_parameters();
factor_ = std::stod(blocks_->at(chnenv_.CRuuid_).get_parameter(prms[3]));
offset_ = std::stod(blocks_->at(chnenv_.CRuuid_).get_parameter(prms[4]));
yunit_ = blocks_->at(chnenv_.CRuuid_).get_parameter(prms[7]);
}
// extract associated CN data
if ( blocks_->count(chnenv_.CNuuid_) == 1 )
{
CN_.parse(buffer_, blocks_->at(chnenv_.CNuuid_).get_parameters());
group_index_ = CN_.group_index_;
group_name_ = CN_.name_;
group_comment_ = CN_.comment_;
prms = blocks_->at(chnenv_.CNuuid_).get_parameters();
name_ = blocks_->at(chnenv_.CNuuid_).get_parameter(prms[6]);
yname_ = name_;
comment_ = blocks_->at(chnenv_.CNuuid_).get_parameter(prms[8]);
// group_index_ = std::stoi(blocks_->at(chnenv_.CNuuid_).get_parameter(prms[2]));
}
if ( !chnenv_.compenv1_.uuid_.empty() && chnenv_.compenv2_.uuid_.empty() )
// extract associated NO data
if ( blocks_->count(chnenv_.NOuuid_) == 1 )
{
// normal dataset (single component)
// set common NT and CD keys if no others are specified
if (chnenv_.compenv1_.NTuuid_.empty()) chnenv_.compenv1_.NTuuid_ = chnenv_.NTuuid_;
if (chnenv_.compenv1_.CDuuid_.empty()) chnenv_.compenv1_.CDuuid_ = chnenv_.CDuuid_;
prms = blocks_->at(chnenv_.NOuuid_).get_parameters();
origin_ = blocks_->at(chnenv_.NOuuid_).get_parameter(prms[4]);
origin_comment_ = blocks_->at(chnenv_.NOuuid_).get_parameter(prms[6]);
}
// comp_group1 contains y-data, x-data is based on xstepwidth_, xstart_ and the length of y-data
component_group comp_group1(chnenv_.compenv1_, blocks_, buffer_);
dimension_ = 1;
// extract associated NL data
// codepage:
// - http://www.iana.org/assignments/character-sets/character-sets.xhtml
// - https://de.wikipedia.org/wiki/Zeichensatztabelle
// language-code:
// - https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-lcid/a9eac961-e77d-41a6-90a5-ce1a8b0cdb9c?redirectedfrom=MSDN
if ( blocks_->count(chnenv_.NLuuid_) == 1 )
{
prms = blocks_->at(chnenv_.NLuuid_).get_parameters();
codepage_ = blocks_->at(chnenv_.NLuuid_).get_parameter(prms[2]);
language_code_ = blocks_->at(chnenv_.NLuuid_).get_parameter(prms[3]);
}
// obtain NT data
// - https://en.cppreference.com/w/cpp/chrono/c/tm
// - https://en.cppreference.com/w/cpp/io/manip/put_time
if ( blocks_->count(chnenv_.NTuuid_) == 1 )
{
prms = blocks_->at(chnenv_.NTuuid_).get_parameters();
//std::tm tm{};
std::tm tms = std::tm();
tms.tm_mday = std::stoi(blocks_->at(chnenv_.NTuuid_).get_parameter(prms[2]));
tms.tm_mon = std::stoi(blocks_->at(chnenv_.NTuuid_).get_parameter(prms[3])) - 1;
tms.tm_year = std::stoi(blocks_->at(chnenv_.NTuuid_).get_parameter(prms[4])) - 1900;
tms.tm_hour = std::stoi(blocks_->at(chnenv_.NTuuid_).get_parameter(prms[5]));
tms.tm_min = std::stoi(blocks_->at(chnenv_.NTuuid_).get_parameter(prms[6]));
double secs = std::stold(blocks_->at(chnenv_.NTuuid_).get_parameter(prms[7]));
double secs_int;
trigger_time_frac_secs_ = modf(secs,&secs_int);
tms.tm_sec = (int)secs_int;
xstepwidth_ = comp_group1.CD_.dx_;
xunit_ = comp_group1.CD_.unit_;
ybuffer_offset_ = comp_group1.Cb_.offset_buffer_;
ybuffer_size_ = comp_group1.Cb_.number_bytes_;
xstart_ = comp_group1.Cb_.x0_;
yfactor_ = comp_group1.CR_.factor_;
yoffset_ = comp_group1.CR_.offset_;
yunit_ = comp_group1.CR_.unit_;
name_ = comp_group1.CN_.name_;
yname_ = comp_group1.CN_.name_;
comment_ = comp_group1.CN_.comment_;
ynum_bytes_ = comp_group1.CP_.bytes_;
ydatatp_ = comp_group1.CP_.numeric_type_;
ysignbits_ = comp_group1.CP_.signbits_;
// generate std::chrono::system_clock::time_point type
std::time_t ts = timegm(&comp_group1.NT_.tms_); // std::mktime(&tms);
std::time_t ts = std::mktime(&tms);
trigger_time_ = std::chrono::system_clock::from_time_t(ts);
trigger_time_frac_secs_ = comp_group1.NT_.trigger_time_frac_secs_;
// calculate absolute trigger-time
addtime_ = static_cast<long int>(comp_group1.Cb_.add_time_);
absolute_trigger_time_ = trigger_time_ + std::chrono::seconds(addtime_);
// + std::chrono::nanoseconds((long int)(trigger_time_frac_secs_*1.e9));
}
else if ( !chnenv_.compenv1_.uuid_.empty() && !chnenv_.compenv2_.uuid_.empty() )
{
// XY dataset (two components)
// set common NT and CD keys if no others are specified
if (chnenv_.compenv1_.NTuuid_.empty()) chnenv_.compenv1_.NTuuid_ = chnenv_.NTuuid_;
if (chnenv_.compenv1_.CDuuid_.empty()) chnenv_.compenv1_.CDuuid_ = chnenv_.CDuuid_;
if (chnenv_.compenv2_.NTuuid_.empty()) chnenv_.compenv2_.NTuuid_ = chnenv_.NTuuid_;
if (chnenv_.compenv2_.CDuuid_.empty()) chnenv_.compenv2_.CDuuid_ = chnenv_.CDuuid_;
// comp_group1 contains x-data, comp_group2 contains y-data
component_group comp_group1(chnenv_.compenv1_, blocks_, buffer_);
component_group comp_group2(chnenv_.compenv2_, blocks_, buffer_);
dimension_ = 2;
xbuffer_offset_ = comp_group2.Cb_.offset_buffer_;
xbuffer_size_ = comp_group2.Cb_.number_bytes_;
ybuffer_offset_ = comp_group1.Cb_.offset_buffer_;
ybuffer_size_ = comp_group1.Cb_.number_bytes_;
xfactor_ = comp_group2.CR_.factor_;
xoffset_ = comp_group2.CR_.offset_;
yfactor_ = comp_group1.CR_.factor_;
yoffset_ = comp_group1.CR_.offset_;
xdatatp_ = comp_group2.CP_.numeric_type_;
xsignbits_ = comp_group2.CP_.signbits_;
ydatatp_ = comp_group1.CP_.numeric_type_;
ysignbits_ = comp_group1.CP_.signbits_;
// generate std::chrono::system_clock::time_point type
std::time_t ts = timegm(&comp_group2.NT_.tms_); // std::mktime(&tms);
trigger_time_ = std::chrono::system_clock::from_time_t(ts);
trigger_time_frac_secs_ = comp_group2.NT_.trigger_time_frac_secs_;
absolute_trigger_time_ = trigger_time_;
}
else
{
// no datafield
}
// start converting binary buffer to imc::datatype
if ( !chnenv_.CSuuid_.empty() ) convert_buffer();
// convert any non-UTF-8 codepage to UTF-8 and cleanse any text
convert_encoding();
cleanse_text();
// calculate absolute trigger-time
absolute_trigger_time_ = trigger_time_ + std::chrono::seconds(addtime_);
// + std::chrono::nanoseconds((long int)(trigger_time_frac_secs_*1.e9));
}
// convert buffer to actual datatype
void convert_buffer()
{
// TODO no clue how/if/when to handle buffer offset/mask/subsequent_bytes
// etc. and whatever that shit is!
std::vector<imc::parameter> prms = blocks_->at(chnenv_.CSuuid_).get_parameters();
if ( prms.size() < 4)
{
throw std::runtime_error("CS block is invalid and features to few parameters");
}
// extract (channel dependent) part of buffer
unsigned long int buffstrt = prms[3].begin();
std::vector<unsigned char> yCSbuffer( buffer_->begin()+buffstrt+ybuffer_offset_+1,
buffer_->begin()+buffstrt+ybuffer_offset_+ybuffer_size_+1 );
std::vector<unsigned char> CSbuffer( buffer_->begin()+buffstrt+1,
buffer_->begin()+buffstrt+buffer_size_+1 );
// determine number of values in buffer
unsigned long int ynum_values = (unsigned long int)(yCSbuffer.size()/(ysignbits_/8));
if ( ynum_values*(ysignbits_/8) != yCSbuffer.size() )
unsigned long int num_values = (unsigned long int)(CSbuffer.size()/(signbits_/8));
if ( num_values*(signbits_/8) != CSbuffer.size() )
{
throw std::runtime_error("CSbuffer and significant bits of y datatype don't match");
throw std::runtime_error("CSbuffer and significant bits of datatype don't match");
}
// adjust size of ydata
ydata_.resize(num_values);
if (dimension_ == 1)
// distinguish numeric datatypes included in "imc_datatype"
if ( datatp_ == 1 )
{
// process y-data
process_data(ydata_, ynum_values, ydatatp_, yCSbuffer);
// find appropriate precision for "xdata_" by means of "xstepwidth_"
xprec_ = (xstepwidth_ > 0 ) ? (int)ceil(fabs(log10(xstepwidth_))) : 10;
// fill xdata_
for ( unsigned long int i = 0; i < ynum_values; i++ )
{
xdata_.push_back(xstart_+(double)i*xstepwidth_);
}
imc::convert_data_to_type<imc_Ubyte>(CSbuffer,ydata_);
}
else if (dimension_ == 2)
else if ( datatp_ == 2 )
{
// process x- and y-data
std::vector<unsigned char> xCSbuffer( buffer_->begin()+buffstrt+xbuffer_offset_+1,
buffer_->begin()+buffstrt+xbuffer_offset_+xbuffer_size_+1 );
// determine number of values in buffer
unsigned long int xnum_values = (unsigned long int)(xCSbuffer.size()/(xsignbits_/8));
if ( xnum_values*(xsignbits_/8) != xCSbuffer.size() )
{
throw std::runtime_error("CSbuffer and significant bits of x datatype don't match");
}
if ( xnum_values != ynum_values )
{
throw std::runtime_error("x and y data have different number of values");
}
xprec_ = 9;
process_data(xdata_, xnum_values, xdatatp_, xCSbuffer);
process_data(ydata_, ynum_values, ydatatp_, yCSbuffer);
imc::convert_data_to_type<imc_Sbyte>(CSbuffer,ydata_);
}
else if ( datatp_ == 3 )
{
imc::convert_data_to_type<imc_Ushort>(CSbuffer,ydata_);
}
else if ( datatp_ == 4 )
{
imc::convert_data_to_type<imc_Sshort>(CSbuffer,ydata_);
}
else if ( datatp_ == 5 )
{
imc::convert_data_to_type<imc_Ulongint>(CSbuffer,ydata_);
}
else if ( datatp_ == 6 )
{
imc::convert_data_to_type<imc_Slongint>(CSbuffer,ydata_);
}
else if ( datatp_ == 7 )
{
imc::convert_data_to_type<imc_float>(CSbuffer,ydata_);
}
else if ( datatp_ == 8 )
{
imc::convert_data_to_type<imc_double>(CSbuffer,ydata_);
}
// ...
else if ( datatp_ == 11 )
{
imc::convert_data_to_type<imc_digital>(CSbuffer,ydata_);
}
else
{
throw std::runtime_error("unsupported dimension");
throw std::runtime_error(std::string("unsupported/unknown datatype") + std::to_string(datatp_));
}
transformData(xdata_, xfactor_, xoffset_);
transformData(ydata_, yfactor_, yoffset_);
}
// handle data type conversion
void process_data(std::vector<imc::datatype>& data_, size_t num_values, numtype datatp_, std::vector<unsigned char>& CSbuffer)
{
// adjust size of data
data_.resize(num_values);
// handle data type conversion
switch (datatp_)
// fill xdata_
for ( unsigned long int i = 0; i < num_values; i++ )
{
case numtype::unsigned_byte:
imc::convert_data_to_type<imc_Ubyte>(CSbuffer, data_);
break;
case numtype::signed_byte:
imc::convert_data_to_type<imc_Sbyte>(CSbuffer, data_);
break;
case numtype::unsigned_short:
imc::convert_data_to_type<imc_Ushort>(CSbuffer, data_);
break;
case numtype::signed_short:
imc::convert_data_to_type<imc_Sshort>(CSbuffer, data_);
break;
case numtype::unsigned_long:
imc::convert_data_to_type<imc_Ulongint>(CSbuffer, data_);
break;
case numtype::signed_long:
imc::convert_data_to_type<imc_Slongint>(CSbuffer, data_);
break;
case numtype::ffloat:
imc::convert_data_to_type<imc_float>(CSbuffer, data_);
break;
case numtype::ddouble:
imc::convert_data_to_type<imc_double>(CSbuffer, data_);
break;
case numtype::two_byte_word_digital:
imc::convert_data_to_type<imc_digital>(CSbuffer, data_);
break;
case numtype::six_byte_unsigned_long:
imc::convert_data_to_type<imc_sixbyte>(CSbuffer, data_);
break;
default:
throw std::runtime_error(std::string("unsupported/unknown datatype ") + std::to_string(datatp_));
xdata_.push_back(xoffset_+i*xstepwidth_);
}
}
void transformData(std::vector<imc::datatype>& data, double factor, double offset) {
if (factor != 1.0 || offset != 0.0) {
for (imc::datatype& el : data) {
double fact = (factor == 0.0) ? 1.0 : factor;
el = imc::datatype(el.as_double() * fact + offset);
}
// employ data transformation
if ( factor_ != 1.0 || offset_ != 0.0 )
{
for ( imc::datatype& el: ydata_ )
{
// std::cout<<"value:"<<el.as_double()<<"\n";
el = imc::datatype(el.as_double()*factor_ + offset_);
}
}
// convert any description, units etc. to UTF-8 (by default)
void convert_encoding()
{
if ( !codepage_.empty() )
{
// construct iconv-compatible name for respective codepage
std::string cpn = std::string("CP") + codepage_;
// set up converter
std::string utf = std::string("UTF-8");
iconverter conv(cpn,utf);
conv.convert(name_);
conv.convert(comment_);
conv.convert(origin_);
conv.convert(origin_comment_);
conv.convert(text_);
conv.convert(language_code_);
conv.convert(yname_);
conv.convert(yunit_);
conv.convert(xname_);
conv.convert(xunit_);
conv.convert(group_name_);
conv.convert(group_comment_);
}
}
void cleanse_text()
{
escape_backslash(name_);
escape_backslash(comment_);
escape_backslash(origin_);
escape_backslash(origin_comment_);
escape_backslash(text_);
escape_backslash(language_code_);
escape_backslash(yname_);
escape_backslash(yunit_);
escape_backslash(xname_);
escape_backslash(xunit_);
escape_backslash(group_name_);
escape_backslash(group_comment_);
}
void escape_backslash(std::string &text)
{
char backslash = 0x5c;
std::string doublebackslash("\\\\");
for ( std::string::iterator it = text.begin(); it != text.end(); ++it )
{
if ( int(*it) == backslash ) {
text.replace(it,it+1,doublebackslash);
++it;
}
}
}
@ -662,7 +402,7 @@ namespace imc
std::string get_info(int width = 20)
{
// prepare printable trigger-time
std::time_t tt = std::chrono::system_clock::to_time_t(trigger_time_);
//std::time_t tt = std::chrono::system_clock::to_time_t(trigger_time_);
std::time_t att = std::chrono::system_clock::to_time_t(absolute_trigger_time_);
std::stringstream ss;
@ -670,28 +410,28 @@ namespace imc
<<std::setw(width)<<std::left<<"name:"<<name_<<"\n"
<<std::setw(width)<<std::left<<"comment:"<<comment_<<"\n"
<<std::setw(width)<<std::left<<"origin:"<<origin_<<"\n"
<<std::setw(width)<<std::left<<"origin-comment:"<<origin_comment_<<"\n"
<<std::setw(width)<<std::left<<"description:"<<text_<<"\n"
<<std::setw(width)<<std::left<<"trigger-time-nt:"<<std::put_time(std::gmtime(&tt),"%FT%T")<<"\n"
<<std::setw(width)<<std::left<<"trigger-time:"<<std::put_time(std::gmtime(&att),"%FT%T")<<"\n"
//<<std::setw(width)<<std::left<<"trigger-time:"<<std::put_time(std::localtime(&tt),"%FT%T")<<"\n"
<<std::setw(width)<<std::left<<"trigger-time:"<<std::put_time(std::localtime(&att),"%FT%T")<<"\n"
<<std::setw(width)<<std::left<<"language-code:"<<language_code_<<"\n"
<<std::setw(width)<<std::left<<"codepage:"<<codepage_<<"\n"
<<std::setw(width)<<std::left<<"yname:"<<yname_<<"\n"
<<std::setw(width)<<std::left<<"yunit:"<<yunit_<<"\n"
<<std::setw(width)<<std::left<<"datatype:"<<ydatatp_<<"\n"
<<std::setw(width)<<std::left<<"significant bits:"<<ysignbits_<<"\n"
<<std::setw(width)<<std::left<<"buffer-offset:"<<ybuffer_offset_<<"\n"
<<std::setw(width)<<std::left<<"buffer-size:"<<ybuffer_size_<<"\n"
<<std::setw(width)<<std::left<<"datatype:"<<datatp_<<"\n"
<<std::setw(width)<<std::left<<"significant bits:"<<signbits_<<"\n"
<<std::setw(width)<<std::left<<"buffer-offset:"<<buffer_offset_<<"\n"
<<std::setw(width)<<std::left<<"buffer-size:"<<buffer_size_<<"\n"
//<<std::setw(width)<<std::left<<"add-time:"<<addtime_<<"\n"
<<std::setw(width)<<std::left<<"xname:"<<xname_<<"\n"
<<std::setw(width)<<std::left<<"xunit:"<<xunit_<<"\n"
<<std::setw(width)<<std::left<<"xstepwidth:"<<xstepwidth_<<"\n"
<<std::setw(width)<<std::left<<"xoffset:"<<xstart_<<"\n"
<<std::setw(width)<<std::left<<"factor:"<<yfactor_<<"\n"
<<std::setw(width)<<std::left<<"offset:"<<yoffset_<<"\n"
<<std::setw(width)<<std::left<<"xoffset:"<<xoffset_<<"\n"
<<std::setw(width)<<std::left<<"factor:"<<factor_<<"\n"
<<std::setw(width)<<std::left<<"offset:"<<offset_<<"\n"
<<std::setw(width)<<std::left<<"group:"<<"("<<group_index_<<","<<group_name_
<<","<<group_comment_<<")"<<"\n"
<<std::setw(width)<<std::left<<"ydata:"<<imc::joinvec<imc::datatype>(ydata_,6,9,true)<<"\n"
<<std::setw(width)<<std::left<<"xdata:"<<imc::joinvec<imc::datatype>(xdata_,6,xprec_,true)<<"\n";
<<std::setw(width)<<std::left<<"xdata:"<<imc::joinvec<double>(xdata_,6,xprec_,true)<<"\n";
// <<std::setw(width)<<std::left<<"aff. blocks:"<<chnenv_.get_json()<<"\n";
return ss.str();
}
@ -700,7 +440,7 @@ namespace imc
std::string get_json(bool include_data = false)
{
// prepare printable trigger-time
std::time_t tt = std::chrono::system_clock::to_time_t(trigger_time_);
//std::time_t tt = std::chrono::system_clock::to_time_t(trigger_time_);
std::time_t att = std::chrono::system_clock::to_time_t(absolute_trigger_time_);
std::stringstream ss;
@ -708,27 +448,24 @@ namespace imc
<<"\",\"name\":\""<<name_
<<"\",\"comment\":\""<<comment_
<<"\",\"origin\":\""<<origin_
<<"\",\"origin-comment\":\""<<origin_comment_
<<"\",\"description\":\""<<text_
<<"\",\"trigger-time-nt\":\""<<std::put_time(std::gmtime(&tt),"%FT%T")
<<"\",\"trigger-time\":\""<<std::put_time(std::gmtime(&att),"%FT%T")
<<"\",\"trigger-time\":\""<<std::put_time(std::localtime(&att),"%FT%T")
<<"\",\"language-code\":\""<<language_code_
<<"\",\"codepage\":\""<<codepage_
<<"\",\"yname\":\""<<prepjsonstr(yname_)
<<"\",\"yunit\":\""<<prepjsonstr(yunit_)
<<"\",\"significantbits\":\""<<ysignbits_
<<"\",\"buffer-size\":\""<<ybuffer_size_
<<"\",\"xname\":\""<<prepjsonstr(xname_)
<<"\",\"xunit\":\""<<prepjsonstr(xunit_)
<<"\",\"yname\":\""<<yname_
<<"\",\"yunit\":\""<<yunit_
<<"\",\"significantbits\":\""<<signbits_
<<"\",\"xname\":\""<<xname_
<<"\",\"xunit\":\""<<xunit_
<<"\",\"xstepwidth\":\""<<xstepwidth_
<<"\",\"xoffset\":\""<<xstart_
<<"\",\"xoffset\":\""<<xoffset_
<<"\",\"group\":{"<<"\"index\":\""<<group_index_
<<"\",\"name\":\""<<group_name_
<<"\",\"comment\":\""<<group_comment_<<"\""<<"}";
if ( include_data )
{
ss<<",\"ydata\":"<<imc::joinvec<imc::datatype>(ydata_,0,9,true)
<<",\"xdata\":"<<imc::joinvec<imc::datatype>(xdata_,0,xprec_,true);
<<",\"xdata\":"<<imc::joinvec<double>(xdata_,0,xprec_,true);
}
// ss<<"\",\"aff. blocks\":\""<<chnenv_.get_json()
ss<<"}";
@ -736,25 +473,6 @@ namespace imc
return ss.str();
}
// prepare string value for usage in JSON dump
std::string prepjsonstr(std::string value)
{
std::stringstream ss;
ss<<quoted(value);
return strip_quotes(ss.str());
}
// remove any leading or trailing double quotes
std::string strip_quotes(std::string astring)
{
// head
if ( astring.front() == '"' ) astring.erase(astring.begin()+0);
// tail
if ( astring.back() == '"' ) astring.erase(astring.end()-1);
return astring;
}
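As an aside, the effect of prepjsonstr/strip_quotes above can be reproduced in a minimal, self-contained sketch (illustrative only, not part of the library):

// minimal sketch of the JSON string escaping done by prepjsonstr/strip_quotes
#include <iomanip>
#include <iostream>
#include <sstream>
#include <string>

int main()
{
  std::string yunit = "acceleration \"a\" in m/s^2";
  std::stringstream ss;
  ss<<std::quoted(yunit);                // escape embedded quotes, add surrounding quotes
  std::string escaped = ss.str();
  if ( escaped.front() == '"' ) escaped.erase(escaped.begin());  // strip leading quote
  if ( escaped.back() == '"' ) escaped.erase(escaped.end()-1);   // strip trailing quote
  std::cout<<escaped<<"\n";              // acceleration \"a\" in m/s^2
  return 0;
}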
// print channel
void print(std::string filename, const char sep = ',', int width = 25, int yprec = 9)
{

View File

@ -16,12 +16,12 @@ namespace imc
//
// e.g. ARM Cortex-A72 armv7l gcc version 10.2.0 (Ubuntu 10.2.0-13ubuntu1)
// #ifdef __arm__
// typedef unsigned long int imc_Ulongint;
// typedef signed long int imc_Slongint;
typedef unsigned long int imc_Ulongint;
typedef signed long int imc_Slongint;
// e.g. Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz x86_64 gcc version 10.2.0 (Ubuntu 10.2.0-13ubuntu1)
// #ifdef i386 __i386 __i386__
typedef unsigned int imc_Ulongint;
typedef signed int imc_Slongint;
// typedef unsigned int imc_Ulongint;
// typedef signed int imc_Slongint;
//
typedef float imc_float;
typedef double imc_double;
@ -30,11 +30,7 @@ namespace imc
// typedef <whatever that is ->... > "imc Devices Transitional Recording"
// typedef <sometimestamptype> "Timestamp Ascii"
typedef char16_t imc_digital;
//
typedef struct {
unsigned char bytes[6];
} imc_sixbyte;
// typedef < > imc_sixbyte "6byte unsigned long"
class datatype
{
@ -48,14 +44,13 @@ namespace imc
imc_float sfloat_; // 6
imc_double sdouble_; // 7
imc_digital sdigital_; // 10
imc_sixbyte sixbyte_; // 13
short int dtidx_; // \in \{0,...,7,10,13\}
short int dtidx_; // \in \{0,...,7,10\}
public:
datatype(): ubyte_(0), sbyte_(0),
ushort_(0), sshort_(0),
ulint_(0), slint_(0),
sfloat_(0.0), sdouble_(0.0),
sdigital_(0), sixbyte_({0}),
sdigital_(0),
dtidx_(0) { };
// every supported datatype gets its own constructor
datatype(imc_Ubyte num): ubyte_(num), dtidx_(0) {};
@ -67,13 +62,10 @@ namespace imc
datatype(imc_float num): sfloat_(num), dtidx_(6) {};
datatype(imc_double num): ubyte_(0), sbyte_(0), ushort_(0), sshort_(0),
ulint_(0), slint_(0), sfloat_(0.0), sdouble_(num),
sdigital_(0), sixbyte_({0}), dtidx_(7) {};
sdigital_(0), dtidx_(7) {};
datatype(imc_digital num): ubyte_(0), sbyte_(0), ushort_(0), sshort_(0),
ulint_(0), slint_(0), sfloat_(0.0), sdouble_(0.0),
sdigital_(num), sixbyte_({0}), dtidx_(10) {};
datatype(imc_sixbyte num): ubyte_(0), sbyte_(0), ushort_(0), sshort_(0),
ulint_(0), slint_(0), sfloat_(0.0), sdouble_(0.0),
sdigital_(0), sixbyte_(num), dtidx_(13) {};
ulint_(0), slint_(0), sfloat_(0.0), sdouble_(num),
sdigital_(num), dtidx_(10) {};
// identify type
short int& dtype() { return dtidx_; }
@ -90,7 +82,6 @@ namespace imc
this->sfloat_ = num.sfloat_;
this->sdouble_ = num.sdouble_;
this->sdigital_ = num.sdigital_;
this->sixbyte_ = num.sixbyte_;
this->dtidx_ = num.dtidx_;
}
@ -108,7 +99,6 @@ namespace imc
this->sfloat_ = num.sfloat_;
this->sdouble_ = num.sdouble_;
this->sdigital_ = num.sdigital_;
this->sixbyte_ = num.sixbyte_;
this->dtidx_ = num.dtidx_;
}
@ -170,12 +160,6 @@ namespace imc
this->dtidx_ = 10;
return *this;
}
datatype& operator=(const imc_sixbyte &num)
{
this->sixbyte_ = num;
this->dtidx_ = 13;
return *this;
}
// obtain number as double
double as_double()
@ -190,13 +174,6 @@ namespace imc
else if ( dtidx_ == 6 ) num = (double)sfloat_;
else if ( dtidx_ == 7 ) num = (double)sdouble_;
else if ( dtidx_ == 10 ) num = static_cast<double>(sdigital_);
else if ( dtidx_ == 13 ) {
unsigned long long value = 0;
for (int i = 0; i < 6; ++i) {
value |= static_cast<unsigned long long>(sixbyte_.bytes[i]) << (8 * i);
}
num = static_cast<double>(value);
}
return num;
}
@ -212,13 +189,6 @@ namespace imc
else if ( num.dtidx_ == 6 ) out<<num.sfloat_;
else if ( num.dtidx_ == 7 ) out<<num.sdouble_;
else if ( num.dtidx_ == 10 ) out<<static_cast<double>(num.sdigital_);
else if ( num.dtidx_ == 13 ) {
unsigned long long value = 0;
for (int i = 0; i < 6; ++i) {
value |= static_cast<unsigned long long>(num.sixbyte_.bytes[i]) << (8 * i);
}
out<<static_cast<double>(value);
}
return out;
}
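For reference, the 6-byte little-endian assembly shown above can be exercised in isolation; a minimal, self-contained sketch (illustrative only, with the struct copied from imc_datatype.hpp):

// minimal sketch of the 6-byte little-endian decode used for the sixbyte datatype
#include <cassert>

struct imc_sixbyte { unsigned char bytes[6]; };

// assemble the six bytes into an unsigned integer, least significant byte first
double sixbyte_as_double(const imc_sixbyte& sb)
{
  unsigned long long value = 0;
  for (int i = 0; i < 6; ++i) {
    value |= static_cast<unsigned long long>(sb.bytes[i]) << (8*i);
  }
  return static_cast<double>(value);
}

int main()
{
  imc_sixbyte sb{{0x01, 0x00, 0x00, 0x00, 0x00, 0x01}};  // = 2^40 + 1
  assert(sixbyte_as_double(sb) == 1099511627777.0);
  return 0;
}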

View File

@ -84,7 +84,6 @@ namespace imc
// noncritical keys
key(false,"NO","origin of data",1),
key(false,"NT","timestamp of trigger",1),
key(false,"NT","timestamp of trigger",2),
key(false,"ND","(color) display properties",1),
key(false,"NU","user defined key",1),
key(false,"Np","property of channel",1),

View File

@ -4,7 +4,6 @@
#define IMCOBJECT
#include <time.h>
#include <math.h>
#include "imc_key.hpp"
//---------------------------------------------------------------------------//
@ -246,7 +245,7 @@ namespace imc
// construct members by parsing particular parameters from buffer
void parse(const std::vector<unsigned char>* buffer, const std::vector<parameter>& parameters)
{
if ( parameters.size() < 4 ) throw std::runtime_error("invalid number of parameters in CC");
if ( parameters.size() < 4 ) throw std::runtime_error("invalid number of parameters in CD2");
component_index_ = std::stoi(get_parameter(buffer,&parameters[2]));
analog_digital_ = ( std::stoi(get_parameter(buffer,&parameters[3])) == 2 );
}
@ -273,9 +272,7 @@ namespace imc
imc_devices_transitional_recording,
timestamp_ascii,
two_byte_word_digital,
eight_byte_unsigned_long,
six_byte_unsigned_long,
eight_byte_signed_long
six_byte_unsigned_long
};
// packaging information of component (corresponds to key CP)
@ -295,8 +292,8 @@ namespace imc
{
if ( parameters.size() < 10 ) throw std::runtime_error("invalid number of parameters in CP");
buffer_reference_ = std::stoi(get_parameter(buffer,&parameters[2]));
bytes_ = std::stoi(get_parameter(buffer,&parameters[3]));
numeric_type_ = (numtype)std::stoi(get_parameter(buffer,&parameters[4]));
numeric_type_ = (numtype)std::stoi(get_parameter(buffer,&parameters[3]));
bytes_ = std::stoi(get_parameter(buffer,&parameters[4]));
signbits_ = std::stoi(get_parameter(buffer,&parameters[5]));
mask_ = std::stoi(get_parameter(buffer,&parameters[6]));
offset_ = std::stoul(get_parameter(buffer,&parameters[7]));
@ -339,7 +336,7 @@ namespace imc
// construct members by parsing particular parameters from buffer
void parse(const std::vector<unsigned char>* buffer, const std::vector<parameter>& parameters)
{
if ( parameters.size() < 13 ) throw std::runtime_error("invalid number of parameters in Cb");
if ( parameters.size() < 13 ) throw std::runtime_error("invalid number of parameters in CD2");
number_buffers_ = std::stoul(get_parameter(buffer,&parameters[2]));
bytes_userinfo_ = std::stoul(get_parameter(buffer,&parameters[3]));
buffer_reference_ = std::stoul(get_parameter(buffer,&parameters[4]));
@ -381,7 +378,7 @@ namespace imc
// construct members by parsing particular parameters from buffer
void parse(const std::vector<unsigned char>* buffer, const std::vector<parameter>& parameters)
{
if ( parameters.size() < 8 ) throw std::runtime_error("invalid number of parameters in CR");
if ( parameters.size() < 8 ) throw std::runtime_error("invalid number of parameters in CD2");
transform_ = (get_parameter(buffer,&parameters[2]) == std::string("1"));
factor_ = std::stod(get_parameter(buffer,&parameters[3]));
offset_ = std::stod(get_parameter(buffer,&parameters[4]));
@ -413,7 +410,7 @@ namespace imc
// construct members by parsing particular parameters from buffer
void parse(const std::vector<unsigned char>* buffer, const std::vector<parameter>& parameters)
{
if ( parameters.size() < 9 ) throw std::runtime_error("invalid number of parameters in CN");
if ( parameters.size() < 9 ) throw std::runtime_error("invalid number of parameters in CD2");
group_index_ = std::stoul(get_parameter(buffer,&parameters[2]));
index_bit_ = (get_parameter(buffer,&parameters[4]) == std::string("1"));
name_ = get_parameter(buffer,&parameters[6]);
@ -442,7 +439,7 @@ namespace imc
// construct members by parsing particular parameters from buffer
void parse(const std::vector<unsigned char>* buffer, const std::vector<parameter>& parameters)
{
if ( parameters.size() < 4 ) throw std::runtime_error("invalid number of parameters in CS");
if ( parameters.size() < 4 ) throw std::runtime_error("invalid number of parameters in CD2");
index_ = std::stoul(get_parameter(buffer,&parameters[2]));
}
@ -457,21 +454,6 @@ namespace imc
}
};
// language (corresponds to key NL)
struct language
{
std::string codepage_;
std::string language_code_;
// construct members by parsing particular parameters from buffer
void parse(const std::vector<unsigned char>* buffer, const std::vector<parameter>& parameters)
{
if (parameters.size() < 4) throw std::runtime_error("invalid number of parameters in NL");
codepage_ = get_parameter(buffer, &parameters[2]);
language_code_ = get_parameter(buffer, &parameters[3]);
}
};
// origin of data (corresponds to key NO)
struct origin_data
{
@ -482,7 +464,7 @@ namespace imc
// construct members by parsing particular parameters from buffer
void parse(const std::vector<unsigned char>* buffer, const std::vector<parameter>& parameters)
{
if ( parameters.size() < 7 ) throw std::runtime_error("invalid number of parameters in NO");
if ( parameters.size() < 7 ) throw std::runtime_error("invalid number of parameters in CD2");
origin_ = ( get_parameter(buffer,&parameters[2]) == std::string("1") );
generator_ = get_parameter(buffer,&parameters[4]);
comment_ = get_parameter(buffer,&parameters[6]);
@ -502,30 +484,45 @@ namespace imc
// trigger timestamp (corresponds to key NT1)
struct triggertime
{
std::tm tms_;
double trigger_time_frac_secs_;
int day_, month_, year_;
int hour_, minute_;
double second_;
std::string timestamp_;
// construct members by parsing particular parameters from buffer
void parse(const std::vector<unsigned char>* buffer, const std::vector<parameter>& parameters)
{
if ( parameters.size() < 8 ) throw std::runtime_error("invalid number of parameters in NT1");
tms_ = std::tm();
tms_.tm_mday = std::stoi( get_parameter(buffer,&parameters[2]) );
tms_.tm_mon = std::stoi( get_parameter(buffer,&parameters[3]) ) - 1;
tms_.tm_year = std::stoi( get_parameter(buffer,&parameters[4]) ) - 1900;
tms_.tm_hour = std::stoi( get_parameter(buffer,&parameters[5]) );
tms_.tm_min = std::stoi( get_parameter(buffer,&parameters[6]) );
long double secs = std::stold( get_parameter(buffer,&parameters[7]) );
double secs_int;
trigger_time_frac_secs_ = modf((double)secs,&secs_int);
tms_.tm_sec = (int)secs_int;
if ( parameters.size() < 8 ) throw std::runtime_error("invalid number of parameters in CD2");
day_ = std::stoi( get_parameter(buffer,&parameters[2]) );
month_ = std::stoi( get_parameter(buffer,&parameters[3]) );
year_ = std::stoi( get_parameter(buffer,&parameters[4]) );
hour_ = std::stoi( get_parameter(buffer,&parameters[5]) );
minute_ = std::stoi( get_parameter(buffer,&parameters[6]) );
second_ = std::stod( get_parameter(buffer,&parameters[7]) );
//time_t rawtime;
//struct tm ts;
//time(&rawtime);
//localtime_r(&rawtime,&ts);
//ts.tm_mday = day_;
//ts.tm_mon = month_-1;
//ts.tm_year = year_-1900;
//ts.tm_hour = hour_;
//ts.tm_min = minute_;
//ts.tm_sec = (int)second_;
//asctime_r(&ts,&timestamp_[0]);
timestamp_ = std::to_string(year_) + std::string("-") + std::to_string(month_)
+ std::string("-") + std::to_string(day_)
+ std::string("T") + std::to_string(hour_)
+ std::string(":") + std::to_string(minute_)
+ std::string(":") + std::to_string(second_);
}
// get info string
std::string get_info(int width = 20)
{
std::stringstream ss;
ss<<std::setw(width)<<std::left<<"timestamp:"<<std::put_time(&tms_, "%Y-%m-%dT%H:%M:%S")<<"\n";
ss<<std::setw(width)<<std::left<<"timestamp:"<<timestamp_<<"\n";
return ss.str();
}
};
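For comparison, the std::tm-based variant of this struct can be tried standalone; a minimal sketch of mapping the NT1 fields onto std::tm and formatting with std::put_time, with hard-coded sample values standing in for the parsed parameters:

// minimal sketch: map NT1-style fields onto std::tm and format them
#include <cmath>
#include <ctime>
#include <iomanip>
#include <iostream>

int main()
{
  // sample values standing in for the parsed NT1 parameters
  int day = 24, month = 12, year = 2020, hour = 13, minute = 37;
  double seconds = 42.125;

  std::tm tms{};                                // zero-initialize all fields
  tms.tm_mday = day;
  tms.tm_mon  = month - 1;                      // std::tm months are 0-based
  tms.tm_year = year - 1900;                    // std::tm years count from 1900
  tms.tm_hour = hour;
  tms.tm_min  = minute;
  double secs_int;
  double frac = std::modf(seconds,&secs_int);   // keep fractional seconds separately
  tms.tm_sec  = static_cast<int>(secs_int);

  std::cout<<std::put_time(&tms,"%Y-%m-%dT%H:%M:%S")
           <<" (+"<<frac<<"s)"<<"\n";           // 2020-12-24T13:37:42 (+0.125s)
  return 0;
}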

View File

@ -58,8 +58,6 @@ namespace imc
// open file and stream data into buffer
void fill_buffer()
{
buffer_.clear();
// open file and put data in buffer
try {
std::ifstream fin(raw_file_.c_str(),std::ifstream::binary);
@ -78,8 +76,6 @@ namespace imc
// parse all raw blocks in buffer
void parse_blocks()
{
rawblocks_.clear();
// reset counter to identify computational complexity
cplxcnt_ = 0;
@ -194,8 +190,6 @@ namespace imc
// generate map of blocks using their uuid
void generate_block_map()
{
mapblocks_.clear();
for ( imc::block blk: rawblocks_ )
{
mapblocks_.insert( std::pair<std::string,imc::block>(blk.get_uuid(),blk) );
@ -205,82 +199,36 @@ namespace imc
// generate channel "environments"
void generate_channel_env()
{
channels_.clear();
// declare single channel environment
imc::channel_env chnenv;
chnenv.reset();
imc::component_env *compenv_ptr = nullptr;
// collect affiliate blocks for every channel WITH CHANNEL and AFFILIATE
// BLOCK CORRESPONDENCE GOVERNED BY BLOCK ORDER IN BUFFER!!
for ( imc::block blk: rawblocks_ )
{
if ( blk.get_key().name_ == "NO" ) chnenv.NOuuid_ = blk.get_uuid();
if ( blk.get_key().name_ == "CN" ) chnenv.CNuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CD" ) chnenv.CDuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CT" ) chnenv.CTuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "Cb" ) chnenv.Cbuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CP" ) chnenv.CPuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CR" ) chnenv.CRuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CS" ) chnenv.CSuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "NT" ) chnenv.NTuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "NO" ) chnenv.NOuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "NL" ) chnenv.NLuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CB" ) chnenv.CBuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CG" ) chnenv.CGuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CI" ) chnenv.CIuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CT" ) chnenv.CTuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CN" ) chnenv.CNuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CS" ) chnenv.CSuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CC" )
{
// a new component group is started
// TODO: can we avoid parsing the whole component here?
imc::component component;
component.parse(&buffer_, blk.get_parameters());
if ( component.component_index_ == 1 ) compenv_ptr = &chnenv.compenv1_;
else if ( component.component_index_ == 2 ) compenv_ptr = &chnenv.compenv2_;
else throw std::runtime_error("invalid component index in CC block");
compenv_ptr->CCuuid_ = blk.get_uuid();
compenv_ptr->uuid_ = compenv_ptr->CCuuid_;
}
else if ( blk.get_key().name_ == "CD" )
{
if (compenv_ptr == nullptr) chnenv.CDuuid_ = blk.get_uuid();
else compenv_ptr->CDuuid_ = blk.get_uuid();
}
else if ( blk.get_key().name_ == "NT" )
{
if (compenv_ptr == nullptr) chnenv.NTuuid_ = blk.get_uuid();
else compenv_ptr->NTuuid_ = blk.get_uuid();
}
else if ( blk.get_key().name_ == "Cb" ) compenv_ptr->Cbuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CP" ) compenv_ptr->CPuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CR" ) compenv_ptr->CRuuid_ = blk.get_uuid();
// check for currently associated channel
// TODO: CNuuid is not unique for multichannel data
if ( !chnenv.CNuuid_.empty() )
{
// at the moment only a single channel is supported
// any channel is closed by any of {CB, CG, CI, CT, CS}
if ( blk.get_key().name_ == "CB" || blk.get_key().name_ == "CG"
|| blk.get_key().name_ == "CI" || blk.get_key().name_ == "CT"
|| blk.get_key().name_ == "CS" )
// any component/channel is closed by any of {CS, CC, CG, CB}
if ( blk.get_key().name_ == "CS" || blk.get_key().name_ == "CC"
|| blk.get_key().name_ == "CG" || blk.get_key().name_ == "CB" )
{
// provide UUID for channel
// for multi component channels exactly one CN is available
chnenv.uuid_ = chnenv.CNuuid_;
// for multichannel data there may be multiple channels referring to
// the same (final) CS block (in contrast to what the IMC software
// documentation seems to suggest) resulting in all channels missing
// a CS block except for the very last
if ( chnenv.CSuuid_.empty() ) {
for ( imc::block blkCS: rawblocks_ ) {
if ( blkCS.get_key().name_ == "CS"
&& blkCS.get_begin() > (unsigned long int)stol(chnenv.uuid_) ) {
chnenv.CSuuid_ = blkCS.get_uuid();
}
}
}
// create channel object and add it to the map of channels
channels_.insert( std::pair<std::string,imc::channel>
(chnenv.CNuuid_,imc::channel(chnenv,&mapblocks_,&buffer_))
@ -288,14 +236,6 @@ namespace imc
// reset channel uuid
chnenv.CNuuid_.clear();
chnenv.CBuuid_.clear();
chnenv.CGuuid_.clear();
chnenv.CIuuid_.clear();
chnenv.CTuuid_.clear();
chnenv.CSuuid_.clear();
compenv_ptr = nullptr;
}
}
@ -303,11 +243,11 @@ namespace imc
// already belong to NEXT component
if ( blk.get_key().name_ == "CB" ) chnenv.CBuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CG" ) chnenv.CGuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CI" ) chnenv.CIuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CT" ) chnenv.CTuuid_ = blk.get_uuid();
else if ( blk.get_key().name_ == "CC" ) chnenv.CCuuid_ = blk.get_uuid();
}
}
public:
// provide buffer size
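To illustrate the ordering convention stated above (channel and affiliate blocks are associated purely by their order in the buffer), here is a small, self-contained sketch with plain key strings standing in for the actual block objects (hypothetical, not the library's code):

// illustrative sketch: group keys into channels by order of appearance,
// closing the current channel when one of {CS, CC, CG, CB} follows a CN
#include <iostream>
#include <string>
#include <vector>

int main()
{
  std::vector<std::string> keys = {"NO","NT","CB","CG","CC","CP","Cb","CR","CN","CS"};
  std::vector<std::vector<std::string>> channels;
  std::vector<std::string> current;
  bool have_channel = false;

  for ( const std::string& key: keys )
  {
    current.push_back(key);
    if ( key == "CN" ) have_channel = true;
    else if ( have_channel && ( key == "CS" || key == "CC"
                             || key == "CG" || key == "CB" ) )
    {
      channels.push_back(current);   // close channel, keep collected blocks
      current.clear();
      have_channel = false;
    }
  }

  std::cout<<"channels found: "<<channels.size()<<"\n";  // 1
  return 0;
}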

View File

@ -8,25 +8,22 @@ EXE = imctermite
# directory names
SRC = src/
LIB = lib/
CYT = cython/
PYT = python/
# list headers and include directories
# list headers
HPP = $(wildcard $(LIB)/*.hpp)
IPP = $(shell find $(LIB) -type f -name '*.hpp')
KIB = $(shell find $(LIB) -type d)
MIB = $(foreach dir,$(KIB),-I $(dir))
# choose compiler and its options
CC = g++ -std=c++17
OPT = -O3 -Wall -Wconversion -Wpedantic -Werror -Wunused-variable -Wsign-compare
#OPT = -O3 -Wall -mavx -mno-tbm -mf16c -mno-f16c
OPT = -O3 -Wall -Werror -Wunused-variable -Wsign-compare
# determine git version/commit and release tag
GTAG := $(shell git tag -l --sort=version:refname | tail -n1 | sed "s/$^v//g")
GTAG := $(shell git tag | tail -n1)
GHSH := $(shell git rev-parse HEAD | head -c8)
GVSN := $(shell cat python/VERSION | tr -d ' \n')
# current timestamp
TMS = $(shell date +%Y%m%dT%H%M%S)
RTAG := v$(shell cat pip/setup.py | grep version | grep -oP "([0-9]\.){2}[0-9]")
CTAG := v$(shell cat cython/setup.py | grep version | grep -oP "([0-9]\.){2}[0-9]")
# define install location
INST := /usr/local/bin
@ -34,50 +31,50 @@ INST := /usr/local/bin
#-----------------------------------------------------------------------------#
# C++ and CLI tool
# build executable
$(EXE): check-tags $(GVSN) main.o
# build executable
$(EXE) : check-vtag $(RTAG) main.o
$(CC) $(OPT) main.o -o $@
# build main.cpp and include git version/commit tag
main.o: src/main.cpp $(IPP)
main.o : src/main.cpp $(HPP)
@cp $< $<.cpp
@sed -i 's/TAGSTRING/$(GTAG)/g' $<.cpp
@sed -i 's/HASHSTRING/$(GHSH)/g' $<.cpp
@sed -i 's/TIMESTAMPSTRING/$(TMS)/g' $<.cpp
$(CC) -c $(OPT) $(MIB) $<.cpp -o $@
$(CC) -c $(OPT) -I $(LIB) $<.cpp -o $@
@rm $<.cpp
install: $(EXE)
install : $(EXE)
cp $< $(INST)/
uninstall: $(INST)/$(EXE)
uninstall : $(INST)/$(EXE)
rm $<
cpp-clean:
cpp-clean :
rm -vf $(EXE)
rm -vf *.o
#-----------------------------------------------------------------------------#
# C++ linter
# linter and code check
check-code:
cppcheck --enable=all -I lib/ src/main.cpp
#-----------------------------------------------------------------------------#
# versions
# check version consistency of git tags and version string in package.json
$(GTAG):
$(GTAG) :
@echo "consistent versions check successful: building $(GTAG)"
check-tags:
@echo "latest git tag: $(GTAG)"
@echo "latest git hash: $(GHSH)"
@echo "python version: $(GVSN)"
check-vtag:
@echo "git tag version: "$(GTAG)
@echo "git commit hash: "$(GHSH)
@echo "release version: "$(RTAG)
@echo "module version: "$(CTAG)
#-----------------------------------------------------------------------------#
# Docker
docker-build:
docker-build :
docker build ./ --tag imctermite:0.1
docker-run:
@ -86,29 +83,28 @@ docker-run:
#-----------------------------------------------------------------------------#
# python
python-build: check-tags $(GVSN)
make -C python/ build-inplace
cp python/imctermite*.so ./ -v
cython-build : check-vtag $(CTAG) $(CYT)setup.py $(CYT)imc_termite.pxd $(CYT)py_imc_termite.pyx $(HPP)
python3 $(CYT)setup.py build_ext --inplace
cp -v imc_termite.cpython-*.so $(PYT)
python-clean:
make -C python/ clean
rm -vf imctermite*.so
cython-install : check-vtag $(CTAG) $(CYT)setup.py $(CYT)imc_termite.pxd $(CYT)py_imc_termite.pyx $(HPP)
python3 $(CYT)setup.py install --record files_imctermite.txt
python-test:
PYTHONPATH=./ python python/examples/usage.py
cython-clean :
rm -vf imc_termite.cpython-*.so
rm -vf $(PYT)imc_termite.cpython-*.so
rm -rvf build/
#-----------------------------------------------------------------------------#
# pip
pip-release: check-vtag $(RTAG) cython-build
cd ./pip/ && make publish
#-----------------------------------------------------------------------------#
# clean
clean: cpp-clean python-clean
#-----------------------------------------------------------------------------#
# github actions
github-action-lint: .github/workflows/pypi-deploy.yml
actionlint $<
# for reference, see:
# https://github.com/rhysd/actionlint
clean: cpp-clean cython-clean
cd ./pip/ && make clean
#-----------------------------------------------------------------------------#

View File

@ -0,0 +1,133 @@
#-----------------------------------------------------------------------------#
import argparse
import os
#-----------------------------------------------------------------------------#
parser = argparse.ArgumentParser(description='List all source dependencies')
#parser.add_argument('pathToRepo',type=str,help='path of source repository')
parser.add_argument('mainSource',type=str,help='main source file')
parser.add_argument('depFile',type=str,help='file listing all dependencies')
args = parser.parse_args()
libpaths = ["/home/mario/Desktop/arrow/cpp/src/",
"/home/mario/Desktop/arrow/cpp/thrift_ep-install/include/",
"/home/mario/Desktop/arrow/cpp/boost_ep-prefix/src/boost_ep/"]
#-----------------------------------------------------------------------------#
def find_dependencies(srcfile, recdepth, cdeplist) :
"""
Given a source file and its dependencies within the given repository path,
list all further dependencies recursively
Args:
srcfile (string): path/name of source file
recdepth (integer): current recursion depth
cdeplist (list): current list of dependencies
Returns:
deps (list): list of source files in the repository that the source file depends on
"""
# define indentation to visualize recursion
indent = recdepth*(" ")
print("\n" + indent + "find_dependencies:"
+ "\n" + indent + "1: " + srcfile
+ "\n" + indent + "2: " + str(recdepth)
+ "\n" + indent + "3: " + str(len(cdeplist)) + "\n")
# show dependencies so far
#print(cdeplist)
# generate dependencies by means of g++
libdeps = (" -I ").join(libpaths)
cmd = "g++ -c -MMD " + srcfile + " -I " + libdeps
print(indent + cmd )
os.system(cmd)
# open dependency file and extract list of sources
basename = srcfile.split('/')[-1].split('.')[0]
depfile = basename + '.d'
print(indent + "reading dependency file " + depfile)
with open(depfile,'r') as fin :
depslist = fin.readlines()
# delete dependencies and object files
os.system("rm " + basename + ".d")
os.system("rm " + basename + ".o")
# remove first line
depslist = depslist[1:]
# delete leading space and trailing backslash
depslistcl = [dep.lstrip(' ').rstrip(' \\\n') for dep in depslist]
# collect dependencies
newdeps = []
# check all dependencies recursively and collect further dependencies
count = 0
for dep in depslistcl :
# append source itself to list
if dep not in cdeplist :
print(indent + "adding dependency " + dep)
newdeps.append(dep)
count = count + 1
print(indent + "=> added " + str(count) + "/" + str(len(depslistcl)) )
# check recursion depth
if recdepth < 20 :
# check all dependencies of every single dependency
for dep in depslistcl :
# try to find corresponding *.cc, (*.cpp) file
depcc = dep.split('.')[0] + '.cc'
print(indent + "checking for " + depcc)
if os.path.exists(depcc) :
if depcc not in cdeplist and depcc not in newdeps :
# add file itself as dependency
newdeps.append(depcc)
# find dependencies of single source
newrecdeps = find_dependencies(depcc,recdepth+1,cdeplist+newdeps)
# append to list
for el in newrecdeps :
if el not in newdeps :
newdeps.append(el)
else :
print(indent + "already in list")
else :
print(indent + "does not exist")
print("\n")
# provide list of dependencies
return newdeps
#-----------------------------------------------------------------------------#
if __name__== "__main__":
print("\nCLI arguments:\n" + str(args) + "\n")
# collect list of dependencies
deps = []
# start recursion with given source file
deps = find_dependencies(args.mainSource,0,[])
print("\nfinal list of dependencies: (" + str(len(deps)) + ")\n")
print(deps)
print("\n")
# remove any duplicates
depsuni = set(deps)
print("\nfinal set of dependencies: (" + str(len(depsuni)) + ")\n")
print(depsuni)
print("\n")
# write list of dependencies
with open(args.depFile,'w') as fout :
for el in depsuni :
fout.write(str(el) + '\n')
#-----------------------------------------------------------------------------#
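For reference, the parquet makefile further below invokes this script as python3 generate_deps.py reader-writer.cc deps.log (see its deps.log target); note that the libpaths list at the top is hard-coded to a local Arrow/Thrift/Boost checkout and presumably has to be adapted before the script is useful elsewhere.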

View File

@ -0,0 +1,22 @@
#-----------------------------------------------------------------------------#
from pathlib import Path
# find source files
srcpaths = Path("src/").rglob('*.cc')
deps =[ str(path) for path in srcpaths ]
print(deps)
with open('makefileobj','w') as fout :
for el in deps :
basnam = el.split('/')[-1]
print(str(el) + " : " + str(basnam) + " : " + str(basnam.split('.')[1]))
if basnam.split('.')[1] == 'cc' :
objfile = 'bin/' + basnam.replace('.cc','.o')
fout.write(objfile + " : " + el + "\n")
fout.write("\t" + "$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@\n")
fout.write("\n")
#-----------------------------------------------------------------------------#

parquet/parquet/makefile Normal file
View File

@ -0,0 +1,356 @@
#-----------------------------------------------------------------------------#
CPP := g++ -std=c++14
CPPFLAGS := -Woverflow -Wpedantic -Wextra -Waddress -Waligned-new -Walloc-zero
SRC := src/
BIN := bin/
LIBS := -I src/src/ -I src/thrift_ep-install/include/ -I src/boost_ep-prefix/src/boost_ep/
#-----------------------------------------------------------------------------#
# prepare source
#
# before: $ cd arrow/cpp/ and compile relevant sources by
# $ cmake . -D ARROW_PARQUET=ON -D PARQUET_BUILD_EXAMPLES=ON -D ARROW_WITH_SNAPPY=ON
# $ cmake .. -D ARROW_PARQUET=ON ARROW_BUILD_EXAMPLES=ON
lib :
cmake . -D ARROW_WITH_BROTLI=ON -D ARROW_WITH_BZ2=ON -D ARROW_WITH_LZ4=ON -D ARROW_WITH_SNAPPY=ON -D ARROW_WITH_ZLIB=ON -D ARROW_PARQUET=ON -D ARROW_PYTHON=ON
# cp-src : deps.log
# ./src_copy.sh
deps.log :
python3 generate_deps.py reader-writer.cc $@
SRC := $(shell find $(SRC) -name '*.cc')
# OBJ := $(addprefix obj/, $(SRC:%.cc=%.o))
OBJ := $(addprefix $(BIN),$(notdir $(SRC:%.cc=%.o)))
check :
@echo $(SRC)
@echo $(OBJ)
# vpath %.cc src/
reader-writer-example : reader-writer.cc $(OBJ) bin/utilmemory.o
$(CPP) $(CPPFLAGS) $< $(LIBS) -o $@ $(OBJ) bin/utilmemory.o
# $(OBJ) : $(SRC)
# $(CPP) $(OPT) -c $< -o $@ -I src/src/
#
# $(BIN)%.o : $(SRC)
# $(CPP) $(OPT) -c $< -I src/src/ -o $@
clean-obj :
rm -f $(OBJ)
# => do build with cmake like here
# https://arrow.apache.org/docs/developers/python.html#build-and-test
#-----------------------------------------------------------------------------#
bin/type.o : src/src/arrow/type.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/result.o : src/src/arrow/result.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder.o : src/src/arrow/builder.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/tensor.o : src/src/arrow/tensor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/table.o : src/src/arrow/table.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/extension_type.o : src/src/arrow/extension_type.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/device.o : src/src/arrow/device.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/memory_pool.o : src/src/arrow/memory_pool.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/datum.o : src/src/arrow/datum.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/record_batch.o : src/src/arrow/record_batch.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/compare.o : src/src/arrow/compare.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/visitor.o : src/src/arrow/visitor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/chunked_array.o : src/src/arrow/chunked_array.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/status.o : src/src/arrow/status.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/pretty_print.o : src/src/arrow/pretty_print.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/sparse_tensor.o : src/src/arrow/sparse_tensor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/buffer.o : src/src/arrow/buffer.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/scalar.o : src/src/arrow/scalar.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/string.o : src/src/arrow/util/string.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/utilmemory.o : src/src/arrow/util/memory.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/future.o : src/src/arrow/util/future.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/iterator.o : src/src/arrow/util/iterator.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/compression.o : src/src/arrow/util/compression.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/utf8.o : src/src/arrow/util/utf8.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/time.o : src/src/arrow/util/time.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/cpu_info.o : src/src/arrow/util/cpu_info.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/thread_pool.o : src/src/arrow/util/thread_pool.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bit_util.o : src/src/arrow/util/bit_util.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/logging.o : src/src/arrow/util/logging.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/basic_decimal.o : src/src/arrow/util/basic_decimal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/decimal.o : src/src/arrow/util/decimal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bit_block_counter.o : src/src/arrow/util/bit_block_counter.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/key_value_metadata.o : src/src/arrow/util/key_value_metadata.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/int_util.o : src/src/arrow/util/int_util.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/io_util.o : src/src/arrow/util/io_util.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bitmap_ops.o : src/src/arrow/util/bitmap_ops.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bitmap_builders.o : src/src/arrow/util/bitmap_builders.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bit_run_reader.o : src/src/arrow/util/bit_run_reader.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/value_parsing.o : src/src/arrow/util/value_parsing.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/string_builder.o : src/src/arrow/util/string_builder.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/formatting.o : src/src/arrow/util/formatting.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_primitive.o : src/src/arrow/array/array_primitive.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_dict.o : src/src/arrow/array/array_dict.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_binary.o : src/src/arrow/array/builder_binary.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_union.o : src/src/arrow/array/builder_union.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/concatenate.o : src/src/arrow/array/concatenate.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_nested.o : src/src/arrow/array/array_nested.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_decimal.o : src/src/arrow/array/array_decimal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_primitive.o : src/src/arrow/array/builder_primitive.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/data.o : src/src/arrow/array/data.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/diff.o : src/src/arrow/array/diff.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_nested.o : src/src/arrow/array/builder_nested.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_decimal.o : src/src/arrow/array/builder_decimal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_dict.o : src/src/arrow/array/builder_dict.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_binary.o : src/src/arrow/array/array_binary.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_adaptive.o : src/src/arrow/array/builder_adaptive.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_base.o : src/src/arrow/array/array_base.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/validate.o : src/src/arrow/array/validate.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_base.o : src/src/arrow/array/builder_base.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/util.o : src/src/arrow/array/util.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/caching.o : src/src/arrow/io/caching.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/memory.o : src/src/arrow/io/memory.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/interfaces.o : src/src/arrow/io/interfaces.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/buffered.o : src/src/arrow/io/buffered.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/file.o : src/src/arrow/io/file.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/strtod.o : src/src/arrow/vendored/double-conversion/strtod.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bignum.o : src/src/arrow/vendored/double-conversion/bignum.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/fixed-dtoa.o : src/src/arrow/vendored/double-conversion/fixed-dtoa.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/fast-dtoa.o : src/src/arrow/vendored/double-conversion/fast-dtoa.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/diy-fp.o : src/src/arrow/vendored/double-conversion/diy-fp.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/double-conversion.o : src/src/arrow/vendored/double-conversion/double-conversion.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bignum-dtoa.o : src/src/arrow/vendored/double-conversion/bignum-dtoa.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/cached-powers.o : src/src/arrow/vendored/double-conversion/cached-powers.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/api_aggregate.o : src/src/arrow/compute/api_aggregate.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/exec.o : src/src/arrow/compute/exec.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/kernel.o : src/src/arrow/compute/kernel.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/registry.o : src/src/arrow/compute/registry.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/function.o : src/src/arrow/compute/function.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/cast.o : src/src/arrow/compute/cast.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/api_vector.o : src/src/arrow/compute/api_vector.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/api_scalar.o : src/src/arrow/compute/api_scalar.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/codegen_internal.o : src/src/arrow/compute/kernels/codegen_internal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/column_scanner.o : src/src/parquet/column_scanner.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/statistics.o : src/src/parquet/statistics.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/internal_file_decryptor.o : src/src/parquet/internal_file_decryptor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/column_writer.o : src/src/parquet/column_writer.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/encryption.o : src/src/parquet/encryption.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/file_reader.o : src/src/parquet/file_reader.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/properties.o : src/src/parquet/properties.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/encryption_internal.o : src/src/parquet/encryption_internal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/internal_file_encryptor.o : src/src/parquet/internal_file_encryptor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/types.o : src/src/parquet/types.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/encoding.o : src/src/parquet/encoding.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/metadata.o : src/src/parquet/metadata.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/printer.o : src/src/parquet/printer.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/level_conversion.o : src/src/parquet/level_conversion.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/deprecated_io.o : src/src/parquet/deprecated_io.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/file_writer.o : src/src/parquet/file_writer.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/schema.o : src/src/parquet/schema.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/platform.o : src/src/parquet/platform.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/column_reader.o : src/src/parquet/column_reader.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@

View File

@ -0,0 +1,96 @@
#-----------------------------------------------------------------------------#
PARQUETDIR := /home/mario/Desktop/Record_Evolution/parquet-cpp
ARROWDIR := /home/mario/Desktop/Record_Evolution/arrow/cpp/src
CPP := g++ -std=c++14
OPT := -Wall -Woverflow -Wpedantic -Wextra -Waddress -Waligned-new -Walloc-zero
prepare : collect_parquet modify_parquet collect_arrow modify_arrow
collect_parquet :
cp -r $(PARQUETDIR)/src/parquet ./
cp $(PARQUETDIR)/examples/low-level-api/reader_writer.h ./
cp $(PARQUETDIR)/examples/low-level-api/reader-writer.cc ./
modify_parquet :
cp parquet/parquet_version.h.in parquet/parquet_version.h
sed -i 's/ReadableFileInterface/ReadWriteFileInterface/g' parquet/util/memory.h
sed -i 's/ReadableFileInterface/ReadWriteFileInterface/g' parquet/file_reader.h
sed -i 's/arrow::Codec/arrow::util::Codec/g' parquet/util/memory.h
sed -i 's/valid_bits_writer/valid_bits_offset/g' parquet/column_reader.h
collect_arrow :
cp -r $(ARROWDIR)/arrow ./
modify_arrow :
cp arrow/util/bit_util.h arrow/util/bit-util.h
collect_test :
cp $(PARQUETDIR)/examples/low-level-api/reader-writer.cc ./
subst :
sed -i 's/#include \"arrow\//\/\/#include \"arrow/g' parquet/properties.h
test :
$(CPP) $(OPT) -I$(PWD) reader-writer.cc
clean :
rm -r parquet/ arrow/
rm reader-writer.cc reader_writer.h
#-----------------------------------------------------------------------------#
# choose shell
SHELL:=/bin/bash
SRC = reader-writer
# specify path of cloned directory
ARROWGIT := /home/mario/Desktop/Record_Evolution/arrow
filewriter : parquet/file_writer.cc
$(CPP) -c $(OPT) $<
# build executable (and generate dependency file)
readwrite : reader-writer.cc
$(CPP) $(OPT) -MMD $< -I ./
# generate dependency file
$(SRC).d : $(SRC).cc
$(CPP) -c -MMD $< -I ./ -I $(ARROWGIT)/cpp/src/
# extract source dependencies
extract-dep : $(SRC).d
@# extract relevant dependencies
cat $< | sed 's/ /\n/g' | awk 'NF' | grep -v '\\' | grep '\/' > deps.log
cat deps.log | sed ':a;N;$!ba;s/\n/ /g' > headers.log
cat headers.log | sed 's/.h$$/.cc/g' > sources.log
@# copy required sources
mkdir -p temp/
cp --parents `cat headers.log` temp/
cp --parents `cat sources.log` temp/ 2>/dev/null
mv temp$(ARROWGIT)/cpp/src/* ./
rm -r temp
clean-dep :
rm -f deps.log headers.log sources.log $(SRC).d
#-----------------------------------------------------------------------------#
# only use more recent and up to date repository arrow.git
# build arrow shared/static libraries
build :
cd arrow/cpp
# cmake -LA to show all options
cmake . -D ARROW_PARQUET=ON #ARROW_ARMV8_ARCH=armv8-a
make
example :
cd arrow/cpp/examples/parquet/low-level-api/
g++ reader-writer.cc -I. -I../../../src/ -L../../../../cpp/build/release/ -larrow -lparquet
# set environment variable LD_LIBRARY_PATH=../../../../cpp/build/release/ before launching executable
#------------------------------------------------------------------------------------#

parquet/parquet/makefileobj Normal file
View File

@ -0,0 +1,303 @@
bin/type.o : src/src/arrow/type.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/result.o : src/src/arrow/result.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder.o : src/src/arrow/builder.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/tensor.o : src/src/arrow/tensor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/table.o : src/src/arrow/table.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/extension_type.o : src/src/arrow/extension_type.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/device.o : src/src/arrow/device.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/memory_pool.o : src/src/arrow/memory_pool.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/datum.o : src/src/arrow/datum.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/record_batch.o : src/src/arrow/record_batch.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/compare.o : src/src/arrow/compare.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/visitor.o : src/src/arrow/visitor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/chunked_array.o : src/src/arrow/chunked_array.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/status.o : src/src/arrow/status.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/pretty_print.o : src/src/arrow/pretty_print.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/sparse_tensor.o : src/src/arrow/sparse_tensor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/buffer.o : src/src/arrow/buffer.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/scalar.o : src/src/arrow/scalar.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/string.o : src/src/arrow/util/string.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/memory.o : src/src/arrow/util/memory.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/future.o : src/src/arrow/util/future.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/iterator.o : src/src/arrow/util/iterator.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/compression.o : src/src/arrow/util/compression.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/utf8.o : src/src/arrow/util/utf8.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/time.o : src/src/arrow/util/time.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/cpu_info.o : src/src/arrow/util/cpu_info.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/thread_pool.o : src/src/arrow/util/thread_pool.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bit_util.o : src/src/arrow/util/bit_util.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/logging.o : src/src/arrow/util/logging.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/basic_decimal.o : src/src/arrow/util/basic_decimal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/decimal.o : src/src/arrow/util/decimal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bit_block_counter.o : src/src/arrow/util/bit_block_counter.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/key_value_metadata.o : src/src/arrow/util/key_value_metadata.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/int_util.o : src/src/arrow/util/int_util.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/io_util.o : src/src/arrow/util/io_util.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bitmap_ops.o : src/src/arrow/util/bitmap_ops.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bitmap_builders.o : src/src/arrow/util/bitmap_builders.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bit_run_reader.o : src/src/arrow/util/bit_run_reader.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/value_parsing.o : src/src/arrow/util/value_parsing.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/string_builder.o : src/src/arrow/util/string_builder.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/formatting.o : src/src/arrow/util/formatting.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_primitive.o : src/src/arrow/array/array_primitive.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_dict.o : src/src/arrow/array/array_dict.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_binary.o : src/src/arrow/array/builder_binary.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_union.o : src/src/arrow/array/builder_union.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/concatenate.o : src/src/arrow/array/concatenate.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_nested.o : src/src/arrow/array/array_nested.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_decimal.o : src/src/arrow/array/array_decimal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_primitive.o : src/src/arrow/array/builder_primitive.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/data.o : src/src/arrow/array/data.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/diff.o : src/src/arrow/array/diff.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_nested.o : src/src/arrow/array/builder_nested.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_decimal.o : src/src/arrow/array/builder_decimal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_dict.o : src/src/arrow/array/builder_dict.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_binary.o : src/src/arrow/array/array_binary.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_adaptive.o : src/src/arrow/array/builder_adaptive.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/array_base.o : src/src/arrow/array/array_base.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/validate.o : src/src/arrow/array/validate.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/builder_base.o : src/src/arrow/array/builder_base.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/util.o : src/src/arrow/array/util.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/caching.o : src/src/arrow/io/caching.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/memory.o : src/src/arrow/io/memory.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/interfaces.o : src/src/arrow/io/interfaces.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/buffered.o : src/src/arrow/io/buffered.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/file.o : src/src/arrow/io/file.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/strtod.o : src/src/arrow/vendored/double-conversion/strtod.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bignum.o : src/src/arrow/vendored/double-conversion/bignum.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/fixed-dtoa.o : src/src/arrow/vendored/double-conversion/fixed-dtoa.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/fast-dtoa.o : src/src/arrow/vendored/double-conversion/fast-dtoa.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/diy-fp.o : src/src/arrow/vendored/double-conversion/diy-fp.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/double-conversion.o : src/src/arrow/vendored/double-conversion/double-conversion.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/bignum-dtoa.o : src/src/arrow/vendored/double-conversion/bignum-dtoa.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/cached-powers.o : src/src/arrow/vendored/double-conversion/cached-powers.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/api_aggregate.o : src/src/arrow/compute/api_aggregate.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/exec.o : src/src/arrow/compute/exec.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/kernel.o : src/src/arrow/compute/kernel.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/registry.o : src/src/arrow/compute/registry.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/function.o : src/src/arrow/compute/function.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/cast.o : src/src/arrow/compute/cast.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/api_vector.o : src/src/arrow/compute/api_vector.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/api_scalar.o : src/src/arrow/compute/api_scalar.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/codegen_internal.o : src/src/arrow/compute/kernels/codegen_internal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/column_scanner.o : src/src/parquet/column_scanner.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/statistics.o : src/src/parquet/statistics.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/internal_file_decryptor.o : src/src/parquet/internal_file_decryptor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/column_writer.o : src/src/parquet/column_writer.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/encryption.o : src/src/parquet/encryption.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/file_reader.o : src/src/parquet/file_reader.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/properties.o : src/src/parquet/properties.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/encryption_internal.o : src/src/parquet/encryption_internal.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/internal_file_encryptor.o : src/src/parquet/internal_file_encryptor.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/types.o : src/src/parquet/types.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/encoding.o : src/src/parquet/encoding.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/metadata.o : src/src/parquet/metadata.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/printer.o : src/src/parquet/printer.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/level_conversion.o : src/src/parquet/level_conversion.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/deprecated_io.o : src/src/parquet/deprecated_io.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/file_writer.o : src/src/parquet/file_writer.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/schema.o : src/src/parquet/schema.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/platform.o : src/src/parquet/platform.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@
bin/column_reader.o : src/src/parquet/column_reader.cc
$(CPP) $(CPPFLAGS) -c $< $(LIBS) -o $@

View File

@ -0,0 +1,413 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
#include <cassert>
#include <fstream>
#include <iostream>
#include <memory>
#include "reader_writer.h"
/*
* This example describes writing and reading Parquet Files in C++ and serves as a
* reference to the API.
* The file contains all the physical data types supported by Parquet.
* This example uses the RowGroupWriter API that supports writing RowGroups optimized for
*memory consumption
**/
/* Parquet is a structured columnar file format
* Parquet File = "Parquet data" + "Parquet Metadata"
* "Parquet data" is simply a vector of RowGroups. Each RowGroup is a batch of rows in a
* columnar layout
* "Parquet Metadata" contains the "file schema" and attributes of the RowGroups and their
* Columns
* "file schema" is a tree where each node is either a primitive type (leaf nodes) or a
* complex (nested) type (internal nodes)
* For specific details, please refer to the format here:
* https://github.com/apache/parquet-format/blob/master/LogicalTypes.md
**/
constexpr int NUM_ROWS_PER_ROW_GROUP = 500;
const char PARQUET_FILENAME[] = "parquet_cpp_example.parquet";
int main(int argc, char** argv) {
/**********************************************************************************
PARQUET WRITER EXAMPLE
**********************************************************************************/
// parquet::REQUIRED fields do not need definition and repetition level values
// parquet::OPTIONAL fields require only definition level values
// parquet::REPEATED fields require both definition and repetition level values
try {
// Create a local file output stream instance.
using FileClass = ::arrow::io::FileOutputStream;
std::shared_ptr<FileClass> out_file;
PARQUET_ASSIGN_OR_THROW(out_file, FileClass::Open(PARQUET_FILENAME));
// Setup the parquet schema
std::shared_ptr<GroupNode> schema = SetupSchema();
// Add writer properties
parquet::WriterProperties::Builder builder;
builder.compression(parquet::Compression::UNCOMPRESSED);
std::shared_ptr<parquet::WriterProperties> props = builder.build();
// Create a ParquetFileWriter instance
std::shared_ptr<parquet::ParquetFileWriter> file_writer =
parquet::ParquetFileWriter::Open(out_file, schema, props);
// Append a RowGroup with a specific number of rows.
parquet::RowGroupWriter* rg_writer = file_writer->AppendRowGroup();
// Write the Bool column
parquet::BoolWriter* bool_writer =
static_cast<parquet::BoolWriter*>(rg_writer->NextColumn());
for (int i = 0; i < NUM_ROWS_PER_ROW_GROUP; i++) {
bool value = ((i % 2) == 0) ? true : false;
bool_writer->WriteBatch(1, nullptr, nullptr, &value);
}
// Write the Int32 column
parquet::Int32Writer* int32_writer =
static_cast<parquet::Int32Writer*>(rg_writer->NextColumn());
for (int i = 0; i < NUM_ROWS_PER_ROW_GROUP; i++) {
int32_t value = i;
int32_writer->WriteBatch(1, nullptr, nullptr, &value);
}
// Write the Int64 column. Each row repeats twice.
parquet::Int64Writer* int64_writer =
static_cast<parquet::Int64Writer*>(rg_writer->NextColumn());
for (int i = 0; i < 2 * NUM_ROWS_PER_ROW_GROUP; i++) {
int64_t value = i * 1000 * 1000;
value *= 1000 * 1000;
int16_t definition_level = 1;
int16_t repetition_level = 0;
if ((i % 2) == 0) {
repetition_level = 1; // start of a new record
}
int64_writer->WriteBatch(1, &definition_level, &repetition_level, &value);
}
// Write the INT96 column.
parquet::Int96Writer* int96_writer =
static_cast<parquet::Int96Writer*>(rg_writer->NextColumn());
for (int i = 0; i < NUM_ROWS_PER_ROW_GROUP; i++) {
parquet::Int96 value;
value.value[0] = i;
value.value[1] = i + 1;
value.value[2] = i + 2;
int96_writer->WriteBatch(1, nullptr, nullptr, &value);
}
// Write the Float column
parquet::FloatWriter* float_writer =
static_cast<parquet::FloatWriter*>(rg_writer->NextColumn());
for (int i = 0; i < NUM_ROWS_PER_ROW_GROUP; i++) {
float value = static_cast<float>(i) * 1.1f;
float_writer->WriteBatch(1, nullptr, nullptr, &value);
}
// Write the Double column
parquet::DoubleWriter* double_writer =
static_cast<parquet::DoubleWriter*>(rg_writer->NextColumn());
for (int i = 0; i < NUM_ROWS_PER_ROW_GROUP; i++) {
double value = i * 1.1111111;
double_writer->WriteBatch(1, nullptr, nullptr, &value);
}
// Write the ByteArray column. Make every alternate value NULL
parquet::ByteArrayWriter* ba_writer =
static_cast<parquet::ByteArrayWriter*>(rg_writer->NextColumn());
for (int i = 0; i < NUM_ROWS_PER_ROW_GROUP; i++) {
parquet::ByteArray value;
char hello[FIXED_LENGTH] = "parquet";
hello[7] = static_cast<char>(static_cast<int>('0') + i / 100);
hello[8] = static_cast<char>(static_cast<int>('0') + (i / 10) % 10);
hello[9] = static_cast<char>(static_cast<int>('0') + i % 10);
if (i % 2 == 0) {
int16_t definition_level = 1;
value.ptr = reinterpret_cast<const uint8_t*>(&hello[0]);
value.len = FIXED_LENGTH;
ba_writer->WriteBatch(1, &definition_level, nullptr, &value);
} else {
int16_t definition_level = 0;
ba_writer->WriteBatch(1, &definition_level, nullptr, nullptr);
}
}
// Write the FixedLengthByteArray column
parquet::FixedLenByteArrayWriter* flba_writer =
static_cast<parquet::FixedLenByteArrayWriter*>(rg_writer->NextColumn());
for (int i = 0; i < NUM_ROWS_PER_ROW_GROUP; i++) {
parquet::FixedLenByteArray value;
char v = static_cast<char>(i);
char flba[FIXED_LENGTH] = {v, v, v, v, v, v, v, v, v, v};
value.ptr = reinterpret_cast<const uint8_t*>(&flba[0]);
flba_writer->WriteBatch(1, nullptr, nullptr, &value);
}
// Close the ParquetFileWriter
file_writer->Close();
// Write the bytes to file
DCHECK(out_file->Close().ok());
} catch (const std::exception& e) {
std::cerr << "Parquet write error: " << e.what() << std::endl;
return -1;
}
/**********************************************************************************
PARQUET READER EXAMPLE
**********************************************************************************/
try {
// Create a ParquetReader instance
std::unique_ptr<parquet::ParquetFileReader> parquet_reader =
parquet::ParquetFileReader::OpenFile(PARQUET_FILENAME, false);
// Get the File MetaData
std::shared_ptr<parquet::FileMetaData> file_metadata = parquet_reader->metadata();
// Get the number of RowGroups
int num_row_groups = file_metadata->num_row_groups();
assert(num_row_groups == 1);
// Get the number of Columns
int num_columns = file_metadata->num_columns();
assert(num_columns == 8);
// Iterate over all the RowGroups in the file
for (int r = 0; r < num_row_groups; ++r) {
// Get the RowGroup Reader
std::shared_ptr<parquet::RowGroupReader> row_group_reader =
parquet_reader->RowGroup(r);
int64_t values_read = 0;
int64_t rows_read = 0;
int16_t definition_level;
int16_t repetition_level;
int i;
std::shared_ptr<parquet::ColumnReader> column_reader;
ARROW_UNUSED(rows_read); // prevent warning in release build
// Get the Column Reader for the boolean column
column_reader = row_group_reader->Column(0);
parquet::BoolReader* bool_reader =
static_cast<parquet::BoolReader*>(column_reader.get());
// Read all the rows in the column
i = 0;
while (bool_reader->HasNext()) {
bool value;
// Read one value at a time. The number of rows read is returned. values_read
// contains the number of non-null rows
rows_read = bool_reader->ReadBatch(1, nullptr, nullptr, &value, &values_read);
// Ensure only one value is read
assert(rows_read == 1);
// There are no NULL values in the rows written
assert(values_read == 1);
// Verify the value written
bool expected_value = ((i % 2) == 0) ? true : false;
assert(value == expected_value);
i++;
}
// Get the Column Reader for the Int32 column
column_reader = row_group_reader->Column(1);
parquet::Int32Reader* int32_reader =
static_cast<parquet::Int32Reader*>(column_reader.get());
// Read all the rows in the column
i = 0;
while (int32_reader->HasNext()) {
int32_t value;
// Read one value at a time. The number of rows read is returned. values_read
// contains the number of non-null rows
rows_read = int32_reader->ReadBatch(1, nullptr, nullptr, &value, &values_read);
// Ensure only one value is read
assert(rows_read == 1);
// There are no NULL values in the rows written
assert(values_read == 1);
// Verify the value written
assert(value == i);
i++;
}
// Get the Column Reader for the Int64 column
column_reader = row_group_reader->Column(2);
parquet::Int64Reader* int64_reader =
static_cast<parquet::Int64Reader*>(column_reader.get());
// Read all the rows in the column
i = 0;
while (int64_reader->HasNext()) {
int64_t value;
// Read one value at a time. The number of rows read is returned. values_read
// contains the number of non-null rows
rows_read = int64_reader->ReadBatch(1, &definition_level, &repetition_level,
&value, &values_read);
// Ensure only one value is read
assert(rows_read == 1);
// There are no NULL values in the rows written
assert(values_read == 1);
// Verify the value written
int64_t expected_value = i * 1000 * 1000;
expected_value *= 1000 * 1000;
assert(value == expected_value);
if ((i % 2) == 0) {
assert(repetition_level == 1);
} else {
assert(repetition_level == 0);
}
i++;
}
// Get the Column Reader for the Int96 column
column_reader = row_group_reader->Column(3);
parquet::Int96Reader* int96_reader =
static_cast<parquet::Int96Reader*>(column_reader.get());
// Read all the rows in the column
i = 0;
while (int96_reader->HasNext()) {
parquet::Int96 value;
// Read one value at a time. The number of rows read is returned. values_read
// contains the number of non-null rows
rows_read = int96_reader->ReadBatch(1, nullptr, nullptr, &value, &values_read);
// Ensure only one value is read
assert(rows_read == 1);
// There are no NULL values in the rows written
assert(values_read == 1);
// Verify the value written
parquet::Int96 expected_value;
ARROW_UNUSED(expected_value); // prevent warning in release build
expected_value.value[0] = i;
expected_value.value[1] = i + 1;
expected_value.value[2] = i + 2;
for (int j = 0; j < 3; j++) {
assert(value.value[j] == expected_value.value[j]);
}
i++;
}
// Get the Column Reader for the Float column
column_reader = row_group_reader->Column(4);
parquet::FloatReader* float_reader =
static_cast<parquet::FloatReader*>(column_reader.get());
// Read all the rows in the column
i = 0;
while (float_reader->HasNext()) {
float value;
// Read one value at a time. The number of rows read is returned. values_read
// contains the number of non-null rows
rows_read = float_reader->ReadBatch(1, nullptr, nullptr, &value, &values_read);
// Ensure only one value is read
assert(rows_read == 1);
// There are no NULL values in the rows written
assert(values_read == 1);
// Verify the value written
float expected_value = static_cast<float>(i) * 1.1f;
assert(value == expected_value);
i++;
}
// Get the Column Reader for the Double column
column_reader = row_group_reader->Column(5);
parquet::DoubleReader* double_reader =
static_cast<parquet::DoubleReader*>(column_reader.get());
// Read all the rows in the column
i = 0;
while (double_reader->HasNext()) {
double value;
// Read one value at a time. The number of rows read is returned. values_read
// contains the number of non-null rows
rows_read = double_reader->ReadBatch(1, nullptr, nullptr, &value, &values_read);
// Ensure only one value is read
assert(rows_read == 1);
// There are no NULL values in the rows written
assert(values_read == 1);
// Verify the value written
double expected_value = i * 1.1111111;
assert(value == expected_value);
i++;
}
// Get the Column Reader for the ByteArray column
column_reader = row_group_reader->Column(6);
parquet::ByteArrayReader* ba_reader =
static_cast<parquet::ByteArrayReader*>(column_reader.get());
// Read all the rows in the column
i = 0;
while (ba_reader->HasNext()) {
parquet::ByteArray value;
// Read one value at a time. The number of rows read is returned. values_read
// contains the number of non-null rows
rows_read =
ba_reader->ReadBatch(1, &definition_level, nullptr, &value, &values_read);
// Ensure only one value is read
assert(rows_read == 1);
// Verify the value written
char expected_value[FIXED_LENGTH] = "parquet";
ARROW_UNUSED(expected_value); // prevent warning in release build
expected_value[7] = static_cast<char>('0' + i / 100);
expected_value[8] = static_cast<char>('0' + (i / 10) % 10);
expected_value[9] = static_cast<char>('0' + i % 10);
if (i % 2 == 0) { // only alternate values exist
// There are no NULL values in the rows written
assert(values_read == 1);
assert(value.len == FIXED_LENGTH);
assert(memcmp(value.ptr, &expected_value[0], FIXED_LENGTH) == 0);
assert(definition_level == 1);
} else {
// There are NULL values in the rows written
assert(values_read == 0);
assert(definition_level == 0);
}
i++;
}
// Get the Column Reader for the FixedLengthByteArray column
column_reader = row_group_reader->Column(7);
parquet::FixedLenByteArrayReader* flba_reader =
static_cast<parquet::FixedLenByteArrayReader*>(column_reader.get());
// Read all the rows in the column
i = 0;
while (flba_reader->HasNext()) {
parquet::FixedLenByteArray value;
// Read one value at a time. The number of rows read is returned. values_read
// contains the number of non-null rows
rows_read = flba_reader->ReadBatch(1, nullptr, nullptr, &value, &values_read);
// Ensure only one value is read
assert(rows_read == 1);
// There are no NULL values in the rows written
assert(values_read == 1);
// Verify the value written
char v = static_cast<char>(i);
char expected_value[FIXED_LENGTH] = {v, v, v, v, v, v, v, v, v, v};
assert(memcmp(value.ptr, &expected_value[0], FIXED_LENGTH) == 0);
i++;
}
}
} catch (const std::exception& e) {
std::cerr << "Parquet read error: " << e.what() << std::endl;
return -1;
}
std::cout << "Parquet Writing and Reading Complete" << std::endl;
return 0;
}
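As a quick cross-check, the file produced by this example can also be opened with pyarrow. A minimal sketch; the file name below is an assumption and has to match whatever name the writer above was given:
import pyarrow.parquet as pq
# read the file written by the C++ example and inspect it
table = pq.read_table("parquet_cpp_example.parquet")  # assumed output name
print(table.schema)   # expect the 8 columns defined in the schema
print(table.num_rows)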

View File

@ -0,0 +1,71 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
#include <arrow/io/file.h>
#include <arrow/util/logging.h>
#include <parquet/api/reader.h>
#include <parquet/api/writer.h>
using parquet::ConvertedType;
using parquet::Repetition;
using parquet::Type;
using parquet::schema::GroupNode;
using parquet::schema::PrimitiveNode;
constexpr int FIXED_LENGTH = 10;
static std::shared_ptr<GroupNode> SetupSchema() {
parquet::schema::NodeVector fields;
// Create a primitive node named 'boolean_field' with type:BOOLEAN,
// repetition:REQUIRED
fields.push_back(PrimitiveNode::Make("boolean_field", Repetition::REQUIRED,
Type::BOOLEAN, ConvertedType::NONE));
// Create a primitive node named 'int32_field' with type:INT32, repetition:REQUIRED,
// logical type:TIME_MILLIS
fields.push_back(PrimitiveNode::Make("int32_field", Repetition::REQUIRED, Type::INT32,
ConvertedType::TIME_MILLIS));
// Create a primitive node named 'int64_field' with type:INT64, repetition:REPEATED
fields.push_back(PrimitiveNode::Make("int64_field", Repetition::REPEATED, Type::INT64,
ConvertedType::NONE));
fields.push_back(PrimitiveNode::Make("int96_field", Repetition::REQUIRED, Type::INT96,
ConvertedType::NONE));
fields.push_back(PrimitiveNode::Make("float_field", Repetition::REQUIRED, Type::FLOAT,
ConvertedType::NONE));
fields.push_back(PrimitiveNode::Make("double_field", Repetition::REQUIRED, Type::DOUBLE,
ConvertedType::NONE));
// Create a primitive node named 'ba_field' with type:BYTE_ARRAY, repetition:OPTIONAL
fields.push_back(PrimitiveNode::Make("ba_field", Repetition::OPTIONAL, Type::BYTE_ARRAY,
ConvertedType::NONE));
// Create a primitive node named 'flba_field' with type:FIXED_LEN_BYTE_ARRAY,
// repetition:REQUIRED, field_length = FIXED_LENGTH
fields.push_back(PrimitiveNode::Make("flba_field", Repetition::REQUIRED,
Type::FIXED_LEN_BYTE_ARRAY, ConvertedType::NONE,
FIXED_LENGTH));
// Create a GroupNode named 'schema' using the primitive nodes defined above
// This GroupNode is the root node of the schema tree
return std::static_pointer_cast<GroupNode>(
GroupNode::Make("schema", Repetition::REQUIRED, fields));
}
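For orientation, a rough pyarrow analogue of the schema built in SetupSchema() above. This is a hedged sketch, not an exact mapping: pyarrow exposes no INT96 type (a nanosecond timestamp stands in for it) and the REPEATED int64 field is approximated by a list:
import pyarrow as pa
# approximate pyarrow counterpart of the C++ SetupSchema() above
schema = pa.schema([
    pa.field("boolean_field", pa.bool_(), nullable=False),
    pa.field("int32_field", pa.time32("ms"), nullable=False),    # INT32 / TIME_MILLIS
    pa.field("int64_field", pa.list_(pa.int64())),                # REPEATED -> list
    pa.field("int96_field", pa.timestamp("ns"), nullable=False),  # no INT96 in pyarrow
    pa.field("float_field", pa.float32(), nullable=False),
    pa.field("double_field", pa.float64(), nullable=False),
    pa.field("ba_field", pa.binary()),                            # OPTIONAL byte array
    pa.field("flba_field", pa.binary(10), nullable=False),        # fixed length 10
])
print(schema)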

6
parquet/parquet/src_copy.sh Executable file
View File

@ -0,0 +1,6 @@
#!/bin/bash
mkdir src
cat deps.log | while read f; do cp --parents $f src/; done;
mv src/home/mario/Desktop/arrow/cpp/* src/
rm -r src/home/

171
parquet/parquet/src_setup.sh Executable file
View File

@ -0,0 +1,171 @@
#!/bin/bash
#-----------------------------------------------------------------------------#
# NOTE: before starting to extract the minimal required sources and dependencies
# run
# $ cd cpp/
# $ cmake -D ARROW_PARQUET=ON
# in the arrow repository
# provide
# - local path of clone of https://github.com/apache/arrow.git
# - name/path of main .hpp file of cython extension
repo="$1"
main="$2"
depf="$3"
# check CLI arguments
if [ -z "$repo" ] || [ -z "$main" ] || [ -z "$depf" ]; then
echo "please provide..."
echo "1. local path of arrow repository"
echo "2. name of main .hpp/.cpp"
echo "3. desired name of dependency file"
echo -e "example:\n./setup-sources.sh /home/mario/Desktop/Record_Evolution/arrow/ reader-writer.cc deps.log"
exit 1
fi
echo -e "extracting sources from/for \n1: ${repo}\n2: ${main}\n3: ${depf}\n"
# make sure the dependency file is empty
rm -f ${depf}
touch ${depf}
# define maximal recursion depth
maxdep=8
#-----------------------------------------------------------------------------#
# define function to list dependencies of source file in repository recursively
listDependencies()
{
rep="$1"
src="$2"
dep="$3"
rec="$4"
echo -e "\nstarting 'listDependencies()' for\n1. ${rep}\n2. ${src}\n3. ${dep}\n4. ${rec}"
# generate dependency file (and remove resulting object file)
echo -e "g++ -c -MMD ${src} -I ${rep}cpp/src/\n"
g++ -c -MMD ${src} -I ${rep}cpp/src/
# derive name of dependency and object files
depf=$(basename ${src} | sed 's/.cc/.d/g')
objf=$(basename ${src} | sed 's/.cc/.o/g')
rm ${objf}
# list dependencies by
# 1. removing header
# 2. remove source itself
# 3. delete leading spaces
# 4. delete trailing backslashes
# 5. remove empty lines
cat ${depf} | grep ${rep} | grep -v ${src} | tr -d "^ " | tr -d "\\\\" | awk 'NF' > listdep.log
# rm ${depf}
while IFS= read -r fs
do
echo "$fs"
# check if dependency is already in the list
if grep -Fxq "$fs" "$dep"
then
echo "dep exist"
else
echo "dep does not exist yet => adding it"
# add dependency to list
echo "$fs" >> ${dep}
# check for corresponding source file
fssourc=$(echo ${fs} | sed 's/.h$/.cc/g' | sed 's/.hpp$/.cpp/g')
echo ${fssourc}
if [ -f "$fssourc" ]
then
echo "source file exists"
# list nested dependencies
if [ "$rec" -lt "$maxdep" ]
then
# increment recursion depth
recinc=$(($rec+1))
# call recursion
listDependencies ${rep} ${fssourc} ${dep} ${recinc}
else
echo "maximal recursion depth exceeded"
fi
else
echo "source file does not exist"
fi
fi
echo ""
done < listdep.log
# cat listdep.log | while read fs
# do
# echo $fs
# # check if dependency is already in the list
# inlist=$(cat listdep.log | grep ${fs} | wc -l)
# echo ${inlist}
# # check for any corresponding source files
# # if [ -f ]
# done
}
#-----------------------------------------------------------------------------#
# call function to list dependencies (recursively)
listDependencies ${repo} ${main} ${depf} 0
# # generate dependency file (and remove resulting object file)
# echo -e "generate dependencies:\ng++ -c -MMD ${main} -I ./ -I ${repo}cpp/src/\n"
# g++ -c -MMD ${main} -I ${repo}cpp/src/
# rm $(echo ${main} | sed 's/.cc/.o/g')
#
# # derive name of dependency file
# dep=$(echo ${main} | sed 's/.cc/.d/g')
#
# if [ -f "$dep" ]; then
#
# # list dependencies
# cat ${dep} | sed 's/ /\n/g' | awk 'NF' | grep -v '\\' | grep '\/' > deps.log
#
# # extract list of headers
# cat deps.log | sed ':a;N;$!ba;s/\n/ /g' > deps-headers.log
# echo "list of required headers ($(cat deps.log | wc -l))"
# cat deps-headers.log
# echo ""
#
# # imply list of sources
# cat deps.log | sed 's/.h$/.cc/g' | sed 's/.hpp$/.cpp/g' > sources_raw.log
# cat sources_raw.log | while read f
# do
# if [ -f "$f" ]; then
# echo $f >> sources_check.log
# fi
# done
# cat sources_check.log | sed ':a;N;$!ba;s/\n/ /g' > deps-sources.log
# echo "list of required sources ($(cat sources_check.log | wc -l))"
# cat deps-sources.log
# echo ""
#
# # remove all temporary files
# rm ${dep} deps.log
# rm sources_raw.log sources_check.log
#
# # copy required headers and sources
# echo -e "copy required headers and sources"
# mkdir temp/
# cp --parents `cat deps-headers.log` temp/
# cp --parents `cat deps-sources.log` temp/
# mv temp${repo}cpp/src/* ./
# rm -r temp
#
# # remove dependencies
# #rm deps-headers.log deps-sources.log
#
# # show files
# ls -lh
#
# else
#
# echo -e "\nERROR: failed to generate dependency file\n"
#
# fi
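The grep/tr/awk pipeline in listDependencies() above essentially parses the gcc -MMD output into a flat list of header paths. A small Python sketch of that single step, assuming a hypothetical dependency file deps_example.d:
# turn a gcc -MMD dependency file into a plain list of paths
def read_deps(depfile):
    with open(depfile) as f:
        text = f.read()
    # drop the "target.o:" prefix, line-continuation backslashes and blanks
    body = text.split(":", 1)[1] if ":" in text else text
    return [tok for tok in body.replace("\\", " ").split() if tok]

print(read_deps("deps_example.d"))  # hypothetical file name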

View File

@ -0,0 +1,26 @@
FROM ubuntu:19.10
RUN apt-get update -y && apt-get install -y \
apt-utils \
git g++ \
make cmake \
pkg-config \
#build-essentials \
python3 \
python3-setuptools \
cython3 \
python3-numpy
RUN git clone https://github.com/apache/arrow.git --single-branch --depth=1
COPY . ./
RUN chmod u+x ./build_arrow_cpp.sh
RUN chmod u+x ./build_arrow_python.sh
RUN ./build_arrow_cpp.sh
RUN ./build_arrow_python.sh
#RUN chmod u+x ./build_arrow.sh
#CMD ["./build_arrow.sh"]
CMD ["sleep 1d"]

View File

@ -0,0 +1,5 @@
#!/bin/bash
docker build . --tag=pyarrowbuild:latest
docker run -it pyarrowbuild:latest /bin/bash

View File

@ -0,0 +1,65 @@
#!/bin/bash
# sleep infinity   # debug helper: uncomment to keep the container alive instead of running the build below
startts=$(date)
echo "starting build process at ${startts}..."
echo -e "\nhome directory is..."
pwd
echo -e "\ncloning apache/arrow..."
git clone https://github.com/apache/arrow.git --single-branch --depth=1
echo -e "\nls -lh /\n"
ls -lh /
echo -e "\nls -lh arrow/\n"
ls -lh arrow/
echo -e "\nls -lh arrow/python/\n"
ls -lh arrow/python
mkdir arrow/cpp/build
pushd arrow/cpp/build
cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
-DCMAKE_INSTALL_LIBDIR=lib \
-DARROW_WITH_BZ2=ON \
-DARROW_WITH_ZLIB=ON \
-DARROW_WITH_ZSTD=ON \
-DARROW_WITH_LZ4=ON \
-DARROW_WITH_SNAPPY=ON \
-DARROW_WITH_BROTLI=ON \
-DARROW_PARQUET=ON \
-DARROW_PYTHON=ON \
-DARROW_BUILD_TESTS=OFF \
-DARROW_WITH_HDFS=OFF \
..
make -j4
make install
popd
#cython --version
cython3 --version
pushd arrow/python
export ARROW_LIB_DIR=/lib/
export PYARROW_WITH_PARQUET=1
export PYARROW_WITH_CUDA=0
export PYARROW_WITH_FLIGHT=0
export PYARROW_WITH_DATASET=0
export PYARROW_WITH_ORC=0
export PYARROW_WITH_PLASMA=0
export PYARROW_WITH_S3FS=0
export PYARROW_WITH_HDFS=0
export PYARROW_WITH_GANDIVA=0
python3 setup.py build_ext --inplace
popd
echo " started build process at ${startts} ..."
finishts=$(date)
echo "finishing build process at ${finishts}..."

View File

@ -0,0 +1,23 @@
#!/bin/bash
mkdir arrow/cpp/build
pushd arrow/cpp/build
cmake -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
-DCMAKE_INSTALL_LIBDIR=lib \
-DARROW_WITH_BZ2=ON \
-DARROW_WITH_ZLIB=ON \
-DARROW_WITH_ZSTD=ON \
-DARROW_WITH_LZ4=ON \
-DARROW_WITH_SNAPPY=ON \
-DARROW_WITH_BROTLI=ON \
-DARROW_PARQUET=ON \
-DARROW_PYTHON=ON \
-DARROW_BUILD_TESTS=OFF \
-DARROW_WITH_HDFS=OFF \
-DARROW_WITH_IPC=OFF \
..
make -j4
make install
popd

View File

@ -0,0 +1,15 @@
#!/bin/bash
pushd arrow/python
export PYARROW_WITH_PARQUET=1
export PYARROW_WITH_CUDA=0
export PYARROW_WITH_FLIGHT=0
export PYARROW_WITH_DATASET=0
export PYARROW_WITH_ORC=0
export PYARROW_WITH_PLASMA=0
export PYARROW_WITH_S3FS=0
export PYARROW_WITH_HDFS=0
export PYARROW_WITH_GANDIVA=0
# python3 setup.py build_ext --inplace
python3 setup.py install
popd

View File

@ -0,0 +1,23 @@
build :
docker build . --tag pyarrowbuild
run :
docker run -it pyarrowbuild:latest
run-bash :
	docker run -it --volume=$$(pwd)/build:/home pyarrowbuild:latest /bin/bash
run-volume :
docker run -it -v /home/pirate/pyarrow/build/:/arrow/python/ pyarrowbuild:latest
#sudo docker run -it --volume=$(pwd)/build:/home ubuntu:latest /bin/bash
rm-container :
	cont=$$(docker ps -a | tail -n 26 | awk '{print $$NF}' | sed ':a;N;$$!ba;s/\n/ /g'); \
	echo $${cont}; \
	docker rm $${cont}
rm-image :
	img=$$(docker image ls --quiet | sed ':a;N;$$!ba;s/\n/ /g'); \
	docker image rm $${img}

View File

@ -0,0 +1,18 @@
import pyarrow.parquet as pq
import pyarrow.csv as pv
csvfile = 'pressureVacuum.csv'
tb = pv.read_csv(csvfile,parse_options=pv.ParseOptions(delimiter=','))
print(tb)
parquetfile = 'pressureVacuum.parquet'
pq.write_table(tb,parquetfile,compression='BROTLI')
# {'NONE', 'SNAPPY', 'GZIP', 'LZO', 'BROTLI', 'LZ4', 'ZSTD'}
df = pq.read_table(parquetfile,columns=None)
print(df)
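A small sketch to compare the codecs listed above on the same table, reusing the tb variable from this script; LZO is usually not built into pyarrow and is skipped, and actual codec availability depends on the pyarrow build:
import os
# write the same table with each codec and report the resulting file size
for codec in ['NONE', 'SNAPPY', 'GZIP', 'BROTLI', 'LZ4', 'ZSTD']:
    fname = 'pressureVacuum_' + codec.lower() + '.parquet'
    pq.write_table(tb, fname, compression=codec)
    print(codec, os.path.getsize(fname), 'bytes')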

8
parquet/pyarrow_arm/sync_pi.sh Executable file
View File

@ -0,0 +1,8 @@
#!/bin/bash
if [ -z "$1" ]
then
exit 1
fi
scp $1 pirate@mf-pi-40:/home/pirate/pyarrow/

View File

@ -1,5 +1,4 @@
include lib/*.hpp
include *.hpp
include *.cpp
include *.pyx
include *.pxd
include VERSION

4
pip/bkup.pyproject.toml Normal file
View File

@ -0,0 +1,4 @@
[build-system]
requires = [
"setuptools"
]

29
pip/makefile Normal file
View File

@ -0,0 +1,29 @@
# --------------------------------------------------------------------------- #
SHELL := /bin/bash
publish: sdist upload
sdist: ../cython/py_imc_termite.pyx ../cython/imc_termite.pxd ../cython/py_imc_termite.cpp
cp -v $? ./
cp -v $(shell ls ../lib/imc_*.hpp) ./
tail -n 212 ../README.md > ./README.md
cp -v ../LICENSE ./
python3 setup.py sdist
# authentication:
# - username: __token__
# - password: <token value including pypi-prefix>
upload:
python3 -m twine upload dist/$(shell ls -t dist/ | head -n1)
clean:
rm -rvf dist/
rm -rvf *.egg-info
rm -rvf build/
rm -rvf cython/
rm -vf *.pyx *.pxd
rm -vf *.cpp *.c *.hpp
rm -vf README.md LICENSE
# --------------------------------------------------------------------------- #

49
pip/setup.py Normal file
View File

@ -0,0 +1,49 @@
from setuptools import setup, Extension
import sys
print("building on platform: "+sys.platform)
if sys.platform == "linux" or sys.platform == "darwin" :
cmpargs = ['-std=c++17','-Wno-unused-variable']
lnkargs = ['-std=c++17','-Wno-unused-variable']
elif sys.platform == "win32" :
cmpargs = ['/EHsc','/std:c++17']
lnkargs = []
else :
raise RuntimeError("unknown platform")
with open("README.md", "r", encoding="utf-8") as fh:
long_description = fh.read()
setup(
name="IMCtermite",
version="1.2.8",
author="Record Evolution GmbH",
author_email="mario.fink@record-evolution.de",
maintainer="Record Evolution GmbH",
license="MIT",
description="Enables extraction of measurement data from binary files with extension 'raw' used by proprietary software imcFAMOS/imcSTUDIO and facilitates its storage in open source file formats",
keywords="IMC raw imcFAMOS imcSTUDIO imcCRONOS",
long_description=long_description,
long_description_content_type="text/markdown",
url="https://github.com/RecordEvolution/IMCtermite.git",
project_urls={
"Bug Tracker": "https://github.com/RecordEvolution/IMCtermite/issues",
},
classifiers=[
"Programming Language :: Python :: 3",
"License :: OSI Approved :: MIT License",
"Operating System :: OS Independent",
],
ext_modules=[Extension("imc_termite",
["py_imc_termite.cpp"],
# libraries_dirs=["cython/"],
# include_dirs=["3rdparty/pugixml/","lib/"],
# depends=["../lib/tdm_termite.hpp"]
language='c++',
extra_compile_args=cmpargs,
extra_link_args=lnkargs,
)
],
)

View File

@ -1 +0,0 @@
2.1.18

137
python/example.py Normal file
View File

@ -0,0 +1,137 @@
#-----------------------------------------------------------------------------#
import raw_eater
import raw_meat
import pyarrow as pa
import pyarrow.parquet as pq
from pathlib import Path
fileobj1 = Path("samples/datasetA/").rglob("*.raw")
rawlist1 = [str(fl) for fl in fileobj1]
fileobj2 = Path("samples/datasetB/").rglob("*.raw")
rawlist2 = [str(fl) for fl in fileobj2]
rawlist = rawlist1 #[rawlist1[0],rawlist1[4],rawlist2[0],rawlist2[6]]
for fil in rawlist2 :
rawlist.append(fil)
rawlist.append("./README.md")
print("")
print(rawlist)
print()
#-----------------------------------------------------------------------------#
# alternatively create "empty" instance of "raw_eater" and set file names
eatraw = raw_eater.raweater()
# eatraw.set_file("../smp/pressure_Vacuum.raw".encode())
# convert every single listed file
for rf in rawlist :
print("converting " + str(rf) + "...\n" + 90*("-") + "\n")
# setup instance of "raw_eater" and trigger conversion
# eatraw = raw_eater.raweater(rf.encode())
# eatraw = raw_meat.rawmerger(rf.encode())
# use global instance of "raw_eater" to set file and perform decoding
eatraw.set_file(rf.encode())
try :
eatraw.do_conversion()
except RuntimeError as e :
print("conversion failed: " + str(e))
# check validity of file format
if eatraw.validity() :
# show channel name and its unit
entity = eatraw.channel_name().decode(encoding='UTF-8',errors='ignore')
unit = eatraw.unit().decode(encoding='UTF-8',errors='ignore')
print("\nentity: " + str(entity))
print("unit: " + str(unit) + "\n")
# obtain extracted data
xt = eatraw.get_time()
yt = eatraw.get_channel()
# show excerpt of data
print("time (length: " + str(len(xt)) + ") \n"
+ str(xt[:10]) + "\n...\n" + str(xt[-10:]) + "\n")
yttrunc = [round(y,4) for y in yt]
print(str(entity) + " (length: " + str(len(yttrunc)) + ") \n"
+ str(yttrunc[:10]) + "\n...\n" + str(yttrunc[-10:]) + "\n")
outname = rf.split('/')[-1].replace('raw','csv')
print("write output to : " + outname)
eatraw.write_table(("output/"+outname).encode(),ord(' '))
else :
print("\nerror: invalid/corrupt .raw file")
print("\n")
#-----------------------------------------------------------------------------#
print("convert and merge channels " + "\n" + 90*("-") + "\n")
# setup new instance to merge channels
eatmea = raw_meat.rawmerger(''.encode()) #rawlist[0].encode())
# add every single channel/file in list
for rf in rawlist :
print("\nadding channel " + str(rf))
try :
succ = eatmea.add_channel(rf.encode())
print("\nrecent time series: length: " + str(len(eatmea.get_time_series())) + "\n")
except RuntimeError as e :
print("failed to add channel: " + str(e))
# show summary of successfully merged channels
print("\nmerged channels:\n")
# write merged table to .csv output
eatmea.write_table_all('output/allchannels.csv'.encode(),ord(','))
# get number of successfully merged channels and their names (+units)
numch = eatmea.get_num_channels()
chnames = [chnm.decode(encoding='UTF-8',errors='ignore') for chnm in eatmea.get_channel_names()]
print("number of channels: " + str(numch))
print("channel names: " + str(chnames))
# obtain final time series
timse = eatmea.get_time_series()
print("\nfinal time series:\nlength:" + str(len(timse)) + "\n")
# get time unit and prepend column name
chnames.insert(0,"Time ["+str(eatmea.time_unit().decode(encoding='UTF-8',errors='ignore'))+"]")
# prepare list of pyarrow arrays
pyarrs = []
pyarrs.append(pa.array(timse))
for i in range(0,numch) :
print("\n" + str(i) + " " + str(chnames[i]))
dat = eatmea.get_channel_by_index(i)
print("length: " + str(len(dat)))
pyarrs.append(pa.array(dat))
print("")
# print("\npyarrow arrays\n" + str(pyarrs))
# create pyarrow table from data
pyarwtab = pa.Table.from_arrays(pyarrs,chnames)
print("\n" + 60*"-" + "\n" + str(pyarwtab) + "\n")
# write pyarrow table to .parquet file with compression
pq.write_table(pyarwtab,'output/allchannels.parquet',compression='BROTLI') # compression='BROTLI', 'SNAPPY')
# try to read and decode the .parquet file
df = pq.read_table('output/allchannels.parquet')
print(df.to_pandas())
# df.to_pandas().to_csv('allchannels.csv',index=False,encoding='utf-8',sep=",")
#-----------------------------------------------------------------------------#

View File

@ -1,43 +0,0 @@
import imctermite
import pandas
import datetime
def add_trigger_time(trigger_time, add_time) :
trgts = datetime.datetime.strptime(trigger_time,'%Y-%m-%dT%H:%M:%S')
dt = datetime.timedelta(seconds=add_time)
return (trgts + dt).strftime('%Y-%m-%dT%H:%M:%S:%f')
if __name__ == "__main__" :
# read file and extract data
imctm = imctermite.imctermite(b"Measurement.raw")
chns = imctm.get_channels(True)
# prepare abscissa
xcol = "time ["+chns[0]['xunit']+"]"
#xcol = "timestamp"
xsts = [add_trigger_time(chns[0]['trigger-time'],tm) for tm in chns[0]['xdata']]
# sort channels
chnnms = sorted([chn['name'] for chn in chns], reverse=False)
chnsdict = {}
for chn in chns :
chnsdict[chn['name']] = chn
# construct dataframe
df = pandas.DataFrame()
df[xcol] = pandas.Series(chns[0]['xdata'])
#df[xcol] = pandas.Series(xsts)
#for idx,chn in enumerate(chns) :
for chnnm in chnnms :
chn = chnsdict[chnnm]
#xcol = (chn['xname'] if chn['xname'] != '' else "x_"+str(idx))+" ["+chn['xunit']+"]"
#df[xcol] = pandas.Series(chn['xdata'])
ycol = chn['yname']+" ["+chn['yunit']+"]"
df[ycol] = pandas.Series(chn['ydata'])
# show entire dataframe and write file
print(df)
df.to_csv("Measurement.csv",header=True,sep='\t',index=False)

View File

@ -1,50 +0,0 @@
import imctermite
import json
import os
import datetime
# declare and initialize instance of "imctermite" by passing a raw-file
try :
imcraw = imctermite.imctermite(b"samples/sampleB.raw")
except RuntimeError as e :
raise Exception("failed to load/parse raw-file: " + str(e))
# obtain list of channels as list of dictionaries (without data)
channels = imcraw.get_channels(False)
print(json.dumps(channels,indent=4, sort_keys=False))
# obtain all channels (including full data)
channelsdata = imcraw.get_channels(True)
# everything that follows is an example that specifically makes use only of
# the first (index = 0) channel ...
idx = 0
if len(channelsdata) > 0 :
# get first channel's data
chnydata = channelsdata[idx]['ydata']
chnxdata = channelsdata[idx]['xdata']
print("xdata: " + str(len(chnxdata)))
print("ydata: " + str(len(chnydata)))
# extract trigger-time
trigtim = datetime.datetime.fromisoformat(channels[idx]["trigger-time"])
print(trigtim)
# file output of data with absolute timestamp in 1st column
filname = os.path.join("./",channelsdata[idx]['name']+".csv")
print("writing output into " + filname)
with open(filname,'w') as fout :
# include column header
fout.write( str(channelsdata[idx]['xname']) + '[' + str(channelsdata[idx]['xunit']) + "]"
+ ","
+ str(channelsdata[idx]['yname']) + '[' + str(channelsdata[idx]['yunit']) + "]"
+ "\n" )
# add data (introduce time shift according to trigger-time)
for row in range(0,len(chnxdata)) :
fout.write( str( (trigtim + datetime.timedelta(seconds=chnxdata[row])).isoformat() )
+ ","
+ str( chnydata[row])
+ "\n" )

View File

@ -1,29 +0,0 @@
from imctermite import imctermite
def show_results(imcraw) :
channels = imcraw.get_channels(False)
print(channels)
channelsData = imcraw.get_channels(True)
print("number of channels: " + str(len(channelsData)))
for (i,chn) in enumerate(channelsData) :
print(str(i) + " | " + chn['name'])
print(chn['xname'] + " | " + chn['xunit'])
print(chn['xdata'][:10])
print(chn['yname'] + " | " + chn['yunit'])
print(chn['ydata'][:10])
print("")
# create instance of 'imctermite'
imcraw = imctermite(b'samples/sampleA.raw')
show_results(imcraw)
# use previous instance of 'imctermite' to provide new file
imcraw.submit_file(b'samples/sampleB.raw')
show_results(imcraw)

View File

@ -1,46 +0,0 @@
setup:
cat ../README.md | grep '^# IMCtermite' -A 50000 > ./README.md
#pandoc -f markdown -t rst -o README.rst README.md
#python -m rstvalidator README.rst
cp -r ../lib ./
cp -v ../LICENSE ./
setup-clean:
rm -vf README.md README.rst LICENSE
rm -rf lib/
build: setup
python setup.py build
build-inplace: setup
python setup.py build_ext --inplace
build-sdist: setup
python setup.py sdist
python -m twine check dist/*
build-bdist: setup
python setup.py bdist
python -m twine check dist/*
build-clean:
python setup.py clean --all
rm -vf imctermite*.so imctermite*.cpp
rm -vf IMCtermite*.so IMCtermite*.cpp
rm -rvf dist/ IMCtermite.egg-info/
rm -rvf dist/ imctermite.egg-info/
cibuildwheel-build: setup
cibuildwheel --platform linux
cibuildwheel-clean:
rm -rvf wheelhouse/
pypi-upload:
python -m twine upload dist/$(shell ls -t dist/ | head -n1)
clean: setup build-clean cibuildwheel-clean setup-clean
run-example:
	PYTHONPATH=$$(pwd) python examples/usage_files.py

View File

@ -0,0 +1,24 @@
import pyarrow as pa
import numpy as np
import pyarrow.parquet as pq
db = pa.array(np.linspace(10,50,6))
print(db)
da = pa.array(np.linspace(0,5,6))
print(da)
filenam = 'pyarrow_testtab.parquet'
patab = pa.Table.from_arrays([da,db],['entity A [unitA]','entity B [unitB]'])
print(patab)
# pq.write_table(patab,filenam,compression='BROTLI')
pq.write_table(patab,filenam,compression='SNAPPY')
df = pq.read_table(filenam)
print(df)
print(df.to_pandas())
#import readline
#readline.write_history_file('generate_pyarrow_table_and_write_parquet.py')

View File

@ -1,6 +0,0 @@
[build-system]
requires = ["setuptools", "wheel","Cython"]
build-backend = "setuptools.build_meta"
[tool.cibuildwheel]
before-all = ""

View File

@ -1,23 +0,0 @@
[metadata]
name = imctermite
description = Enables extraction of measurement data from binary files with extension 'raw' used by proprietary software imcFAMOS and imcSTUDIO and facilitates its storage in open source file formats
long_description = file: README.md
# long_description_content_type = text/x-rst
long_description_content_type = text/markdown
version = file: VERSION
author = Record Evolution GmbH
author_email = mario.fink@record-evolution.de
maintainer = Record Evolution GmbH
url= https://github.com/RecordEvolution/IMCtermite.git
license = MIT License
license_files = LICENSE
keywords = IMC, raw, imcFAMOS, imcSTUDIO, imcCRONOS
classifiers =
Programming Language :: Python :: 3
License :: OSI Approved :: MIT License
Operating System :: OS Independent
Topic :: Scientific/Engineering
Topic :: Software Development :: Libraries :: Python Modules
[options]

View File

@ -1,21 +0,0 @@
from setuptools import Extension, setup
from Cython.Build import cythonize
import sys
print("building on platform: "+sys.platform)
cmpArgs = {
"linux": ['-std=c++17','-Wno-unused-variable'],
"darwin": ['-std=c++17','-Wno-unused-variable'],
"win32": ['/EHsc','/std:c++17']
}
extension = Extension(
"imctermite",
sources=["imctermite.pyx"],
extra_compile_args=cmpArgs[sys.platform]
)
setup(
ext_modules=cythonize(extension,language_level=3)
)

View File

@ -1,11 +1,11 @@
import imctermite
import imc_termite
import json
import os
# declare and initialize instance of "imctermite" by passing a raw-file
try :
imcraw = imctermite.imctermite(b"samples/exampleB.raw")
imcraw = imc_termite.imctermite(b"samples/exampleB.raw")
except RuntimeError as e :
raise Exception("failed to load/parse raw-file: " + str(e))
@ -21,18 +21,18 @@ if len(channelsdata) > 0 :
print(len(chnydata))
print(len(chnxdata))
print()
# print the channels into a specific directory
imcraw.print_channels(b"/tmp/",ord(','))
imcraw.print_channels(b"./data",ord(','))
# print all channels separately
for i,chn in enumerate(channels) :
print(str(i)+" : "+chn['name']+" : "+chn['uuid'])
filname = os.path.join("/tmp/",str(i) + "_" + chn['name']+".csv")
idx = 0
for chn in channels :
print(str(idx)+" : "+chn['name']+" : "+chn['uuid'])
filname = os.path.join("./data",str(idx) + "_" + chn['name']+".csv")
print(filname)
imcraw.print_channel(chn['uuid'].encode(),filname.encode(),ord(','))
idx = idx + 1
# print all channels in single file
imcraw.print_table(b"/tmp/allchannels.csv")
# imcraw.print_table(b"./data/allchannels.csv")

View File

@ -1,5 +1,5 @@
import imctermite
import imc_termite
import json
import os
@ -15,7 +15,7 @@ for fl in rawlist1:
# declare and initialize instance of "imctermite" by passing a raw-file
try :
imcraw = imctermite.imctermite(fl.encode())
imcraw = imc_termite.imctermite(fl.encode())
except RuntimeError as e :
raise Exception("failed to load/parse raw-file: " + str(e))
@ -24,7 +24,7 @@ for fl in rawlist1:
print(json.dumps(channels,indent=4, sort_keys=False))
# print the channels into a specific directory
imcraw.print_channels(b"./",ord(','))
imcraw.print_channels(b"./")
# print all channels in single file
imcraw.print_table(("./"+str(os.path.basename(fl).split('.')[0])+"_allchannels.csv").encode())

Binary file not shown.

View File

@ -1,14 +0,0 @@
|CF,2,1,1;|CK,1,3,1,1;
|Nv,1,12,7,3,4,64,1,0;
|NO,1,12,1,5,FAMOS,0,;
|NL,1,10,1252,0x407;
|CG,1,5,1,1,1;
|CD,1,13,1,1,1,s,0,0,0;
|NT,1,19, 6, 4,2018,11,33,54;
|CC,1,3,1,1;
|CP,1,16,1,8,8,64,0,0,1,0;
|Cb,1,22,1,0,1,1,0,8,0,8,1,0,0,;
|CR,1,11,0,0,0,1,1,V;
|CN,1,35,0,0,0,7,Average,16,Measurement 0815;
|CT,1,43,0,8,TxTester,8,E. Smith,16,Measurement 0815;
|CS,1,10,1,ÍÌÌÌÌÌ(@;

View File

@ -1,24 +0,0 @@
|CF,2,1,1;|CK,1,3,1,1;
|Nv,1,12,7,3,4,64,1,0;
|NO,1,12,1,5,FAMOS,0,;
|NL,1,10,1252,0x407;
|CB,1,12,1,5,Meas1,0,;
|CG,1,5,1,1,1;
|CD,1,16,5E-1,1,1,2,0,0,0;
|NT,1,19, 6, 4,2018,11,24,18;
|CC,1,3,1,1;
|CP,1,15,1,1,2,8,0,0,1,0;
|Cb,1,22,1,0,1,1,0,3,0,3,1,3,0,;
|CR,1,30,1,3.937007874015748E-2,5,1,1,V;
|ND,1,15,-1,-1,-1,0,1E+1;
|CN,1,16,1,0,0,5,Chan1,0,;
|CG,1,5,1,1,1;
|CD,1,16,5E-1,1,1,2,0,0,0;
|NT,1,19, 6, 4,2018,11,24,18;
|CC,1,3,1,1;
|CP,1,15,2,1,2,8,0,0,1,0;
|Cb,1,22,1,0,2,1,3,3,0,3,1,3,0,;
|CR,1,30,1,3.937007874015748E-2,5,1,1,V;
|ND,1,15,-1,-1,-1,0,1E+1;
|CN,1,16,1,0,0,5,Chan2,0,;
|CS,1,8,1, šš šš;

View File

@ -8,7 +8,7 @@
|CR,1,49,1,3.921568627450980E-2,0.000000000000000E+0,1,1,V;
|ND,1,50,-1,-1,-1,0.000000000000000E+0,1.000000000000000E+1;|CN,1,17,1,0,0,6,kanal2,0,;
|CS,1,8,1,
|NO,1,37,1,9,imc-FAMOS,20,Erzeugt:E.Mustermann; |CG,1,5,2,2,2;
|NO,1,37,1,9, imc-FAMOS,20,Erzeugt:E.Mustermann; |CG,1,5,2,2,2;
|CD,1,31,1.000000000000000E-1,1,0,,0,0,0;|NT,1,27,10, 6,1993,19,18,20.0000000;
|CC,1,3,1,1;|CP,1,16,1,4,7,32,0,0,1,0;|Cb,1,40,1,0,1,1,0,16,0,16,1,0.000000000000E+0,0,;
|CR,1,11,0,0,0,1,1,V;|CN,1,20,0,0,0,9,E06_6_121,0,;|CC,1,3,2,1;|CP,1,16,2,4,5,32,0,0,1,0;

View File

@ -1,17 +0,0 @@
|CF,2,1,1;|CK,1,3,1,1;
|Nv,1,12,7,3,4,64,1,0;
|NO,1,12,1,5,FAMOS,0,;
|NL,1,10,1252,0x407;
|CG,1,5,2,2,2;
|CD,1,12,1,1,0,,2,0,0;
|NT,1,18, 6, 4,2018,11,37,1;
|CC,1,3,1,1;
|CP,1,16,1,4,7,32,0,0,1,0;
|Cb,1,24,1,0,1,1,0,16,0,16,1,0,0,;
|CR,1,11,0,0,0,1,1,V;
|CN,1,20,0,0,0,9,MyXY_plot,0,;
|CC,1,3,2,1;
|CP,1,16,2,2,3,16,0,0,1,0;
|Cb,1,23,1,0,2,1,16,8,0,8,1,0,0,;
|CR,1,30,1,4.577706569008927E-5,0,1,1,s;
|CS,1,26,1, @ €? @ffF@ UUªªÿÿ;

Binary file not shown.

View File

@ -12,7 +12,6 @@
const std::string gittag("TAGSTRING");
const std::string githash("HASHSTRING");
const std::string timestamp("TIMESTAMPSTRING");
//---------------------------------------------------------------------------//
@ -128,13 +127,13 @@ optkeys parse_args(int argc, char* argv[], bool list_args = false)
void show_version()
{
std::cout<<"imctermite ["<<gittag<<"-g"<<githash<<"-"<<timestamp<<"]"<<"\n";
std::cout<<"imctermite ["<<gittag<<"-g"<<githash<<"]"<<"\n";
}
void show_usage()
{
std::cout<<"\n"
<<"imctermite ["<<gittag<<"-g"<<githash<<"-"<<timestamp<<"] (https://github.com/RecordEvolution/IMCtermite.git)"
<<"imctermite ["<<gittag<<"-g"<<githash<<"] (https://github.com/RecordEvolution/IMCtermite.git)"
<<"\n\n"
<<"Decode IMC raw files and dump data as *.csv"
<<"\n\n"