initial commit

Mario Fink 2020-02-06 16:24:20 +00:00
commit 20566df9ca
6 changed files with 337 additions and 0 deletions

78
README.md Normal file

@ -0,0 +1,78 @@
# raw_eater
The _raw_eater_ package is used to parse files with extension `*.raw`, which
are usually binary files produced by the lab software _Famos_ to dump measurement
time series.
## File Structure
The binary `*.raw` file features a series of markers that indicate the starting
point of various blocks of information. Every marker is introduced by the character
"|" = `0x 7c` followed by two letters (usually uppercase), which characterize the
type of marker. The following markers are defined:
1. CF (0x 43 46)
1. CK (0x 43 4b)
1. NO (0x 4e 4f)
1. CG (0x 43 47)
1. CD (0x 43 44)
1. NT (0x 4e 54)
1. CC (0x 43 43)
1. CP (0x 43 50)
1. CR (0x 43 52)
1. CN (0x 43 4e)
1. Cb (0x 43 62)
1. CS (0x 43 53)
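
The byte sequences listed above are simply the ASCII codes of the two letters,
preceded by `0x 7c`. A minimal sketch of that mapping, assuming a hypothetical
helper `marker_bytes` that is not part of the package:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of raw_eater): build the byte sequence of a
// marker from its two-letter tag, i.e. '|' (0x7c) followed by the ASCII
// codes of the two letters.
std::vector<unsigned char> marker_bytes(const std::string& tag)
{
  assert( tag.size() == 2 && "a marker tag consists of exactly two letters" );
  return { 0x7c,
           static_cast<unsigned char>(tag[0]),
           static_cast<unsigned char>(tag[1]) };
}
```

For instance, `marker_bytes("CF")` yields the bytes `7c 43 46`, which is the
sequence to search for in the file buffer.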
Each of these markers is followed by multiple comma-separated (0x 2c) parameters
and is terminated by a semicolon `;` = 0x 3b. The exception is the sequence following
the data marker CS, which may contain any number of 0x 3b occurrences and is only
terminated by a semicolon at the very end of the file (since CS is the last marker
section in the file). The markers have the following meaning:
- *CF* (mostly 4 parameters)
this marker is mostly just `|CF,2,1,1;`, and its meaning is still unknown
- *CK* (mostly 4 parameters)
same problem for this one: it always seems to be `|CK,1,3,1,1;`
- *NO* (mostly 6 parameters)
provides some info about the software package/device and its version that
produced the file, e.g. something like
`|NO,1,85,0,77,imc STUDIO 5.0 R3 (10.09.2015)@imc DEVICES 2.8R7 (26.8.2015)@imcDev__15190567,0,;`
- *CG* (mostly 5 parameters)
another marker whose purpose is unclear so far; it looks, for instance, like
`|CG,1,5,1,1,1;`
- *CD* (mostly 11 parameters)
since we're dealing with measured entities from the lab, this marker contains
info about the measurement frequency, i.e. the sample rate. For instance,
`|CD,2, 63, 5.0000000000000001E-03,1,1,s,0,0,0, 0.0000000000000000E+00,1;`
indicates one measured value every 0.005 seconds, i.e. a sample rate of 200 Hz
- *NT* (mostly 8 parameters)
meaning unknown, for instance `|NT,1,16,1,1,1980,0,0,0.0;`
maybe it's the datatype?
- *CC* (mostly 4 parameters)
`|CC,1,3,1,1;`
- *CP* (mostly 10 parameters)
`|CP,1,16,1,4,7,32,0,0,1,0;`
- *CR* (mostly 8 parameters)
provides the _physical unit_ of the measured entity and maybe shows the
minimum and maximum value during the measurement, e.g.
`|CR,1,60,0, 1.0000000000000000E+00, 0.0000000000000000E+00,1,4,mbar;`
- *CN* (mostly 9 parameters)
gives the _name_ of the measured entity
`|CN,1,27,0,0,0,15,pressure_Vacuum,0,;`
- *Cb* (mostly 14 parameters) (optional?)
this one probably gives the minimum/maximum measured values
`|Cb,1,117,1,0,1,1,0,341288,0,341288,1,0.0000000000000000E+00,1.1781711390000000E+09,;`
- *CS* (mostly 4 parameters)
this marker announces the actual measurement data in binary format and
provides the number of values followed by the actual data,
e.g. `|CS,1, 341299, 1, ...data... ;`
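
The text sections described above could be split into their parameters roughly
as follows. This is only a sketch: `split_section` and `sample_rate` are
hypothetical helpers, the split is naive (it assumes no comma or semicolon
occurs inside a parameter), and reading the time step from the fourth field of
CD is an assumption based on the single example shown above. It is not meant
for the CS section, whose binary payload may contain 0x 2c and 0x 3b bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: split a text marker section such as "|CK,1,3,1,1;"
// into its comma-separated fields. The leading '|' and the terminating ';'
// are dropped, so the first field is the two-letter tag.
std::vector<std::string> split_section(const std::string& section)
{
  assert( !section.empty() && section.front() == '|' && "section must start with '|'" );
  std::vector<std::string> fields;
  std::string field;
  for ( std::size_t i = 1; i < section.size(); i++ )
  {
    const char c = section[i];
    if ( c == ';' ) break;                      // end of section
    if ( c == ',' ) { fields.push_back(field); field.clear(); }
    else if ( c != ' ' || !field.empty() ) field.push_back(c); // skip leading blanks
  }
  fields.push_back(field);
  return fields;
}

// Sample rate in Hz, assuming (as in the CD example) that the fourth field
// holds the time step between two samples in seconds.
double sample_rate(const std::string& cd_section)
{
  const std::vector<std::string> fields = split_section(cd_section);
  assert( fields.size() > 3 && fields[0] == "CD" && "expected a CD section" );
  return 1.0 / std::stod(fields[3]);
}
```

Applied to the CD example, `sample_rate` gives approximately 200, matching the
200 Hz stated in the list.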
## Open Issues and Questions
- Which parameter(s) indicate little vs. big endian?
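
To illustrate why the byte order matters, the sketch below assembles an
integer from the same four bytes in both orders. `read_u32` is a hypothetical
helper, and the 4-byte unsigned type is chosen purely for illustration; the
actual sample type of the CS data is not known yet.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: assemble an unsigned 32-bit integer from four
// consecutive bytes of a buffer, for either byte order.
std::uint32_t read_u32(const std::vector<unsigned char>& buf,
                       std::size_t pos, bool little_endian)
{
  assert( pos + 4 <= buf.size() && "not enough bytes left in buffer" );
  std::uint32_t value = 0;
  for ( std::size_t i = 0; i < 4; i++ )
  {
    // little endian: first byte is least significant; big endian: most significant
    const std::size_t shift = little_endian ? 8*i : 8*(3-i);
    value |= static_cast<std::uint32_t>(buf[pos+i]) << shift;
  }
  return value;
}
```

The bytes `01 00 00 00` decode to 1 as little endian, but to 16777216 as big
endian, so a wrong guess should show up quickly as implausible values.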

12
check_markers.sh Executable file

@ -0,0 +1,12 @@
#!/bin/bash
dir=$1
#ls ${dir} | while read fn; do echo $fn; cat ${dir}$fn | grep -a "|[A-Z][A-Z]," -o | wc -l; done;
#ls ${dir} | while read fn; do echo $fn; cat ${dir}$fn | grep -a "|[A-Z][A-Z]," -o; done;
#ls ${dir} | while read fn; do echo $fn; cat ${dir}$fn | xxd | head -n10 | tail -n3; done;
ls ${dir} | while read fn; do echo $fn; cat ${dir}$fn | grep -a "|[A-Z][a-zA-Z]," -o | wc -l; done;
ls ${dir} | while read fn; do echo $fn; cat ${dir}$fn | grep -a "|[A-Z][a-zA-Z]," -o; done;

BIN
eatit Executable file

Binary file not shown.

14
makefile Normal file

@ -0,0 +1,14 @@
RAW = ../raw/
SRC = src/
EXE = eatit
CCC = g++
OPT = -O3 -Wall
$(EXE) : $(SRC)main.cpp $(SRC)raweat.hpp
	$(CCC) $(OPT) $< -o $@
clean :
	rm -f $(EXE)

85
src/main.cpp Normal file

@ -0,0 +1,85 @@
//---------------------------------------------------------------------------//
#include "../src/raweat.hpp"
//---------------------------------------------------------------------------//
int main(int argc, char* argv[])
{
// path of filename provided ?
assert( argc > 1 && "please provide a filename and path" );
std::cout<<"number of CLI-arguments: "<<argc<<"\n";
for ( int i = 0; i < argc; i++ ) std::cout<<std::setw(5)<<i<<": "<<argv[i]<<"\n";
// check number of CLI arguments
assert( argc == 2 );
// get name/path of file from CLI argument
std::string rawfile(argv[1]);
// declare instance of "raw_eater"
raw_eater eatraw(rawfile);
eatraw.show_markers();
eatraw.find_markers();
// tdm_ripper ripper(argv[1],argv[2],false); //,"samples/SineData.tdx",false);
//
// // ripper.list_datatypes();
// // ripper.show_structure();
//
// ripper.print_hash_local("data/hash_table_xml_local.dat");
// ripper.print_hash_values("data/hash_table_xml_value.dat");
// ripper.print_hash_double("data/hash_table_xml_double.dat");
// ripper.print_extid("data/channel_ext_id.dat");
//
// ripper.list_groups();
// std::ofstream gout("data/list_of_groups.dat");
// ripper.list_groups(gout);
// gout.close();
//
// ripper.list_channels();
// std::ofstream fout("data/list_of_channels.dat");
// ripper.list_channels(fout);
// fout.close();
//
// // long int nsa = 6.3636349745e10; // expected result: 22.07.2016, 19:49:05
// // long int nsb = 6.3636350456e10;
// // std::string ts = std::to_string(nsa);
// // std::cout<<ripper.unix_timestamp(ts)<<"\n";
//
// std::cout<<"number of channels "<<ripper.num_channels()<<"\n";
// std::cout<<"number of groups "<<ripper.num_groups()<<"\n\n";
//
// // obtain some specific meta information tags
// std::cout<<"\n"<<ripper.get_meta("SMP_Name")<<"\n";
// std::cout<<ripper.get_meta("SMP_Aufbau_Nr")<<"\n";
// std::cout<<ripper.get_meta("SMP_Type")<<"\n";
// std::cout<<ripper.get_meta("Location")<<"\n\n";
//
// // print all meta information
// ripper.print_meta("data/meta_info.csv");
//
// // for ( int i = 0; i < ripper.num_groups(); i++ )
// // {
// // std::cout<<std::setw(10)<<i+1<<std::setw(10)<<ripper.no_channels(i+1)<<"\n";
// // for ( int j = 0; j < ripper.no_channels(i+1); j++ )
// // {
// // std::cout<<std::setw(10)<<i+1<<std::setw(10)<<j+1<<std::setw(30)<<ripper.channel_name(i+1,j+1)<<"\n";
// // }
// // }
//
// // for ( int i = 3; i < 10; i++ )
// for ( int i = 0; i < ripper.num_channels(); i++ )
// // for ( int i = 11880; i < ripper.num_channels(); i++ )
// {
// ripper.print_channel(i,("data/channel_"+std::to_string(i+1)+"_"
// +ripper.channel_name(i)+".dat").c_str());
// }
return 0;
}
//---------------------------------------------------------------------------//

148
src/raweat.hpp Normal file

@ -0,0 +1,148 @@
//---------------------------------------------------------------------------//
#include <assert.h>
#include <iostream>
#include <fstream>
#include <iomanip>
#include <vector>
#include <iterator>
#include <map>
#include <string>
//---------------------------------------------------------------------------//
class raw_eater
{
private:
// filename and path
std::string rawfile_;
// raw buffer
std::vector<unsigned char> rawdata_;
// file format markers
std::map<std::string,std::vector<unsigned char>> markers_ = {
{"intro marker",{0x7c,0x43,0x46}},
{"fileo marker",{0x7c,0x43,0x4b}},
{"vendo marker",{0x7c,0x4e,0x4f}},
{"param marker",{0x7c,0x43,0x47}},
{"sampl marker",{0x7c,0x43,0x44}},
{"typei marker",{0x7c,0x4e,0x54}},
{"dimen marker",{0x7c,0x43,0x43}},
{"datyp marker",{0x7c,0x43,0x50}},
{"punit marker",{0x7c,0x43,0x52}},
{"ename marker",{0x7c,0x43,0x4e}},
{"minma marker",{0x7c,0x43,0x62}},
{"datas marker",{0x7c,0x43,0x53}}
};
// data sections corresponding to markers
std::map<std::string,std::vector<unsigned char>> datasec_;
public:
// constructor
raw_eater(std::string rawfile) : rawfile_(rawfile)
{
// open file and put data in buffer
std::ifstream fin(rawfile.c_str(),std::ifstream::binary);
if ( !fin.good() ) std::cerr<<"opening file " + rawfile + " failed\n";
assert( fin.good() && "failed to open file" );
// read the entire file into the buffer
rawdata_.assign((std::istreambuf_iterator<char>(fin)),
(std::istreambuf_iterator<char>()));
// show size of buffer
std::cout<<"size of buffer "<<rawdata_.size()<<"\n";
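// show excerpt from buffer (at most the first 128 bytes)
unsigned long int ista = 0, iend = rawdata_.size() < 128 ? rawdata_.size() : 128;
for ( unsigned long int i = ista; i < iend; i++ )
{
std::cout<<std::hex<<(int)rawdata_[i]<<" ";
if ( (i+1)%16 == 0 ) std::cout<<"\n";
}
// switch back to decimal output, since std::hex is sticky
std::cout<<std::dec<<"\n";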
}
// destructor
~raw_eater()
{
}
// show predefined markers
void show_markers()
{
std::cout<<"\n";
for ( auto el: markers_ )
{
std::cout<<el.first<<" ";
for ( unsigned char c: el.second) std::cout<<std::hex<<int(c);
// switch back to decimal output, since std::hex is sticky
std::cout<<std::dec<<"\n";
}
}
// find predefined markers in data buffer
void find_markers()
{
for (std::pair<std::string,std::vector<unsigned char>> mrk : markers_ )
{
assert( mrk.second.size() > 0 && "please don't define any empty markers" );
// find marker's byte sequence in buffer
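// stop early enough so that the comparison below never reads past the end of the buffer
for ( unsigned long int idx = 0; idx + mrk.second.size() <= rawdata_.size(); idx++ )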
{
bool gotit = true;
for ( unsigned long int mrkidx = 0; mrkidx < mrk.second.size() && gotit; mrkidx ++ )
{
if ( ! (mrk.second[mrkidx] == rawdata_[idx+mrkidx]) ) gotit = false;
}
// if we got the marker, collect following bytes until end of marker byte 0x 3b
if ( gotit )
{
// array of data associated to marker
std::vector<unsigned char> markseq;
if ( mrk.first != "datas marker" )
{
// collect bytes until we find semicolon ";", i.e. 0x3b, or reach the end of the buffer
unsigned long int seqidx = idx;
while ( seqidx < rawdata_.size() && rawdata_[seqidx] != 0x3b )
{
markseq.push_back(rawdata_[seqidx]);
seqidx++;
}
}
else
{
// make sure the data marker is actually the last and extends until end of file
//assert( TODO && "data marker doesn't appear to be the very last");
// that's the data itself (without the terminating semicolon at the end of the file)
for ( unsigned long int didx = idx; didx < rawdata_.size()-1; didx++ )
{
markseq.push_back(rawdata_[didx]);
}
}
// keep the section; a later occurrence of the same marker replaces an earlier one
datasec_[mrk.first] = markseq;
}
}
}
// show the size of the data section found for every marker
for (std::pair<std::string,std::vector<unsigned char>> sec : datasec_ )
{
std::cout<<sec.first<<" "<<sec.second.size()<<"\n";
}
}
};
//---------------------------------------------------------------------------//