make a list of animals in col2 of 1st file, then print 2nd file if animals ar NOT in the list
awk 'FNR==NR{inn[$2]; next} !($2 in inn)' test5 relGenApprox
geek and genetics stuff
make a list of animals in col2 of 1st file, then print 2nd file if animals ar NOT in the list
awk 'FNR==NR{inn[$2]; next} !($2 in inn)' test5 relGenApprox
so I want to put Python in a pipe where it reads from std in , does something, writes to std out:
$ cat test.py
#!/usr/bin/env python3
import sys
# https://stackoverflow.com/questions/1450393/how-do-i-read-from-stdin
for n,line in enumerate(sys.stdin):
a=line.split()
if n%5==0:
sys.stdout.write(line)
so you want to know from the output of renumf90 (say ren.log) how many animals were allocated to each UPG?
The output looks like this:
Unknown parent group allocation
Equation Group #Animals
59815785 1 21349
...
Max group = 424; Max UPG ID = 59816208
...
Use this in the command line:
$ sed -n '/allocation/,/Max\ group/p' ren.log | awk '$1 ~ /^[0-9]+$/'
e.g.
59815785 1 21349
59815786 2 4615
explanations:
find between two patterns:
https://askubuntu.com/a/849016
check if the 1st column is a (positive) integer:
https://stackoverflow.com/questions/28878995/check-if-a-field-is-an-integer-in-awk
So in Julia I want to create a dictionary (hash table or associative array) of arrays to store breed composition in dairy cattle. It took me a while but I found out how to declare it:
julia> a=Dict{String,Array{Float64,1}}()
Imagine the following pedigree
then this Julia script computes breed fractions
Dict{String, Vector{Float64}} with 4 entries:
"B" => [0.0, 1.0]
"A" => [1.0, 0.0]
"C" => [0.5, 0.5]
"D" => [0.75, 0.25]
APY is a technique that allows representing the inverse of a genomic relationship matrix in a sparse format by choosing a "core" of animals (here and here). The authoritative guide to APY is Bermann et al. 2022: this paper .
One of the key aspects of genomic models is the ability of estimating SNP effects and then apply them to newly genotyped animals, what is know as Indirect Predictions. Matias and I found out that it's easier than we thought as shown here.
Among many other things Bermann et al. show that one can write indirect predictions of "non-core" animals as (I use the original equation numbering)
eq. (10) : $latex \mathbf{{u}_n}= \mathbf{Z}_n \mathbf{Z}'_c (\mathbf{Z}_c \mathbf{Z}'_c)^{-1} \mathbf{Z}_c \mathbf{{a}} + \boldsymbol{\xi} $latex
where $latex \mathbf{{a}} $latex are SNP effects and $latex \boldsymbol{\xi} $latex is an error term that does not depend on $latex \mathbf{{u}_c} $latex.
we obtain SNP effects from eq. 21 in Bermann et al:
$latex \mathbf{\hat{a}}=k \mathbf{Z}'_c \mathbf{G}_{cc}^{-1} \mathbf{\hat{u}}_{c} $latex
we plug that into (10) and expand:
$latex \mathbf{\hat{u}_n}= \mathbf{Z}_n \mathbf{Z}'_c (\mathbf{Z}_c \mathbf{Z}'_c)^{-1} \mathbf{Z}_c \mathbf{\hat{a}} =k \mathbf{Z}_n \mathbf{Z}'_c (k \mathbf{Z}_c \mathbf{Z}'_c)^{-1} \mathbf{Z}_c \mathbf{\hat{a}} $latex
we substitute for $latex \mathbf{\hat{a}} $latex:
$latex \mathbf{\hat{u}_n} =k \mathbf{Z}_n \mathbf{Z}'_c (k \mathbf{Z}_c \mathbf{Z}'_c)^{-1} \mathbf{Z}_c k \mathbf{Z}'_c \mathbf{G}_{cc}^{-1} \mathbf{\hat{u}}_{c}=k \mathbf{Z}_n \mathbf{Z}'_c \mathbf{G}_{cc}^{-1} \mathbf{\hat{u}}_{c} = \mathbf{Z}_n \mathbf{\hat{a}} $latex
which is the original eq. 21. This is because $latex \hat{a} $latex is part of the column space of $latex \mathbf{Z}'_c $latex, then $latex \mathcal{P}\mathbf{\hat{a}} = \mathbf{\hat{a}} $latex.
Finally, obtaining Indirect Predictions from APY is deadly simple and intuitive:
I wanted to compile the coalescent simulator macs in the Mac running Ventura with M2 chips. This was of course the beginning of a great adventure. I end up doing the following:
- install library boost using homebrew
- dig out the path for the boost library e.g. as here, which turned out to be /opt/homebrew/Cellar/boost/1.81.0_1/include
- finally, modify the makefile as follows
# a vector of frequencies
p=rand(Beta(2,2),10)
# a vector of "distributions"
a=Binomial.(2,p)
# a vector of vectors
b=rand.(a,1)
# collapsed (row vector)
reduce(hcat,b)
# draw 5 animals at once
b=rand.(a,5)
# collapse
bb=reduce(hcat,b)
pest = mean(bb,dims=1)/2