Wednesday, November 27, 2024

quick filter in awk

 Make a list of the animals in column 2 of the 1st file, then print the lines of the 2nd file whose column-2 animal is NOT in the list:


awk 'FNR==NR{inn[$2]; next}  !($2 in inn)' test5 relGenApprox
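
For comparison, here is a minimal Python sketch of the same two-pass idea: collect column 2 of the first input into a set, then keep the lines of the second input whose column 2 is not in it. The function names and the toy data are made up for illustration; the two lists stand in for test5 and relGenApprox.

```python
def second_field(line):
    # Mimic awk's $2: split on whitespace, empty string if the field is missing
    fields = line.split()
    return fields[1] if len(fields) > 1 else ""

def not_in_list(list_lines, data_lines):
    # Pass 1: remember every value seen in column 2 of the first input
    seen = {second_field(line) for line in list_lines}
    # Pass 2: keep only the lines whose column 2 was never seen
    return [line for line in data_lines if second_field(line) not in seen]

# Made-up data standing in for test5 and relGenApprox
listed = ["x cow1 a", "x cow2 b"]
data = ["1 cow1 0.9", "2 cow3 0.8", "3 cow2 0.7", "4 cow4 0.6"]
kept = not_in_list(listed, data)
print("\n".join(kept))
```

Only the lines for cow3 and cow4 survive, because cow1 and cow2 are in the list.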

Friday, October 18, 2024

python like awk

 So I want to put Python in a pipe, where it reads from stdin, does something, and writes to stdout:


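$ cat test.py
#!/usr/bin/env python3
import sys

# https://stackoverflow.com/questions/1450393/how-do-i-read-from-stdin
for n,line in enumerate(sys.stdin):
    a=line.split()  # fields of the line, like awk's $1, $2, ... (not used here)
    # print every 5th line, starting with the first one
    if n%5==0:
        sys.stdout.write(line)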
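
A small variation, as a sketch only: putting the loop in a function that accepts any iterable of lines makes it possible to try the filter without a pipe. The names every_nth and main are my own, and io.StringIO stands in for stdin and stdout.

```python
import io
import sys

def every_nth(lines, step=5):
    # Yield lines 1, 1+step, 1+2*step, ... (enumerate counts from 0)
    for n, line in enumerate(lines):
        if n % step == 0:
            yield line

def main(stdin=sys.stdin, stdout=sys.stdout):
    # In a pipe, stdin and stdout are the real streams
    stdout.writelines(every_nth(stdin))

# Quick check with in-memory "files" instead of a pipe: the numbers 1..12
fake_in = io.StringIO("".join(f"{i}\n" for i in range(1, 13)))
fake_out = io.StringIO()
main(fake_in, fake_out)
result = fake_out.getvalue()
print(result, end="")
```

With the numbers 1 to 12 as input, lines 1, 6 and 11 are kept. To use it in a real pipe, replace the last lines with a bare main() call.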

Friday, March 1, 2024

extract UPG allocation from renumf90 log

 So you want to know, from the output of renumf90 (say ren.log), how many animals were allocated to each UPG (unknown parent group)?

The output looks like this:

 Unknown parent group allocation

 Equation   Group       #Animals

 59815785       1   21349

...

 Max group = 424; Max UPG ID = 59816208

...

Use this in the command line:

$ sed -n '/allocation/,/Max\ group/p' ren.log | awk '$1 ~ /^[0-9]+$/' 

e.g.

 59815785       1   21349

 59815786       2    4615

explanations:

find between two patterns:

https://askubuntu.com/a/849016

check if the 1st column is a (positive) integer: 

https://stackoverflow.com/questions/28878995/check-if-a-field-is-an-integer-in-awk
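
The same extraction can be sketched in Python. This is only a rough equivalent of the sed/awk pipeline, assuming the three-column layout shown above; upg_allocation is a made-up name, and the sample log is a shortened, partly invented stand-in for ren.log.

```python
import re

def upg_allocation(lines):
    """Return (equation, group, n_animals) tuples from the UPG block of a log."""
    rows = []
    inside = False
    for line in lines:
        # Like sed's /allocation/,/Max group/ range: switch on at the first pattern
        if not inside and "allocation" in line:
            inside = True
        if inside:
            fields = line.split()
            # Like awk's $1 ~ /^[0-9]+$/: the first field must be all digits
            if fields and re.fullmatch(r"[0-9]+", fields[0]):
                # Assumption: three integer columns (Equation, Group, #Animals)
                equation, group, n_animals = (int(f) for f in fields[:3])
                rows.append((equation, group, n_animals))
            # Switch off again at the second pattern
            if "Max group" in line:
                inside = False
    return rows

# Shortened sample; the "12345" line is invented to show it is ignored
sample = """\
 12345 made-up line before the block
 Unknown parent group allocation
 Equation   Group       #Animals
 59815785       1   21349
 59815786       2    4615
 Max group = 424; Max UPG ID = 59816208
"""
rows = upg_allocation(sample.splitlines())
for equation, group, n_animals in rows:
    print(equation, group, n_animals)
```

Having the rows as integers makes it easy to go one step further, e.g. sum the animals over all groups.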



Tuesday, January 23, 2024

Dictionary of arrays in Julia and breed fractions

 So in Julia I want to create a dictionary (hash table or associative array) of arrays to store breed composition in dairy cattle. It took me a while but I found out how to declare it:

julia> a=Dict{String,Array{Float64,1}}()

Imagine the following pedigree (columns: animal, its two parents, and the breed for purebred founders; 0 means unknown parent):

A 0 0 Holstein
B 0 0 Jersey
C A B
D A C

Then this Julia script computes the breed fractions of each animal as the average of its two parents' fractions:

#=
A 0 0 Holstein
B 0 0 Jersey
C A B
D A C
=#

breedcomp=Dict{String,Array{Float64,1}}()

#purebred founders
breedcomp["A"]=[1.0,0.0]
breedcomp["B"]=[0.0,1.0]
#rest of pedigree
breedcomp["C"]=0.5*(breedcomp["A"]+breedcomp["B"])
breedcomp["D"]=0.5*(breedcomp["A"]+breedcomp["C"])

display(breedcomp)


Dict{String, Vector{Float64}} with 4 entries:
  "B" => [0.0, 1.0]
  "A" => [1.0, 0.0]
  "C" => [0.5, 0.5]
  "D" => [0.75, 0.25]
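
For comparison, a minimal Python sketch that computes the same fractions directly from the pedigree lines instead of hard-coding each animal. It assumes that parents appear before their offspring and that every non-founder has both parents known; the variable names are my own.

```python
pedigree_text = """\
A 0 0 Holstein
B 0 0 Jersey
C A B
D A C
"""
breeds = ["Holstein", "Jersey"]  # order of the fractions in each list

breedcomp = {}
for line in pedigree_text.splitlines():
    fields = line.split()
    animal, parent1, parent2 = fields[:3]
    if parent1 == "0" and parent2 == "0":
        # Purebred founder: 1.0 for its own breed, 0.0 for the others
        breed = fields[3]
        breedcomp[animal] = [1.0 if b == breed else 0.0 for b in breeds]
    else:
        # Offspring: average of the two parents' breed fractions
        p1, p2 = breedcomp[parent1], breedcomp[parent2]
        breedcomp[animal] = [0.5 * (x + y) for x, y in zip(p1, p2)]

for animal, fractions in breedcomp.items():
    print(animal, fractions)
```

As in the Julia version, D ends up 0.75 Holstein and 0.25 Jersey.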