Fix the sex of parents and add parents that are missing from the data. Can be used with a data.frame or with vectors holding the individuals' information.
Usage
# S4 method for class 'character'
fix_parents(obj, dadid, momid, sex, famid = NULL, missid = NA_character_)
# S4 method for class 'data.frame'
fix_parents(obj, del_parents = NULL, filter = NULL, missid = NA_character_)
Arguments
- obj
A data.frame or a vector of the individuals' identifiers. If a data.frame is given, it must contain the columns id, dadid, momid, sex and famid (optional).
- dadid
A vector containing, for each subject, the identifier of the biological father.
- momid
A vector containing, for each subject, the identifier of the biological mother.
- sex
A character, factor or numeric vector corresponding to the gender of the individuals. This will be transformed to an ordered factor with the following levels: male < female < unknown < terminated
The following values are recognized:
character() or factor(): "f", "m", "woman", "man", "male", "female", "unknown", "terminated"
numeric(): 1 = "male", 2 = "female", 3 = "unknown", 4 = "terminated"
- famid
A character vector with the family identifiers of the individuals. If provided, it will be combined with the individuals' identifiers, separated by an underscore.
- missid
A character vector with the missing values identifiers. All the id, dadid and momid corresponding to those values will be set to NA_character_.
- del_parents
Value defining whether missing parents are deleted or fixed. If "one", both parents are removed when one of them is missing; if "both", both parents are removed when both are missing. If NULL, no parent is removed and the missing parents are added as new rows.
- filter
Filtering column containing 0 or 1 for the rows to be kept before proceeding.
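The sex coding described above can be illustrated with base R. This is only a minimal sketch of the documented mapping from numeric codes to an ordered factor; the variable names (sex_codes, sex_levels, sex_factor) are made up for the illustration and this is not the package's internal implementation.

```r
# Illustrative only: these names are hypothetical and not part of the package.
# Numeric codes as documented: 1 = male, 2 = female, 3 = unknown, 4 = terminated
sex_codes <- c(1, 2, 2, 3, 4, 1)
sex_levels <- c("male", "female", "unknown", "terminated")

# Build an ordered factor with levels male < female < unknown < terminated
sex_factor <- factor(sex_levels[sex_codes], levels = sex_levels, ordered = TRUE)

print(sex_factor)
```

Because the factor is ordered, its levels can be compared and sorted in the documented order.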
Details
First, add the parents whose ids are given in momid/dadid but who are
missing from the data. Second, fix the sex of the parents. Last, add a
second parent for the children for whom only one parent id is given.
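The first and last steps can be pictured with a few lines of base R. This is a hedged sketch on made-up data: it only detects the individuals concerned and does not reproduce what fix_parents does internally. The names missing_parents and one_parent are hypothetical.

```r
# Toy data (hypothetical): fam112 is referenced as a mother but has no row,
# and fam113 has a father but no mother.
id    <- c("fam109", "fam111", "fam113")
dadid <- c(NA, "fam109", "fam109")
momid <- c(NA, "fam112", NA)

# Step 1: parent ids that are referenced but absent from id
parents <- unique(c(dadid, momid))
missing_parents <- setdiff(parents[!is.na(parents)], id)

# Step 3: children for whom exactly one parent id is given
one_parent <- id[xor(is.na(dadid), is.na(momid))]

print(missing_parents)
print(one_parent)
```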
If a famid vector is given, the family id will be added to the
ids of all individuals (id, dadid, momid),
separated by an underscore, before proceeding.
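The resulting identifiers can be pictured as follows. This is a minimal base R sketch, assuming the family id is placed in front of the individual id (as the ids in the example output below suggest) and that missing ids stay missing; prefix_famid is a hypothetical helper, not a function of the package.

```r
# Hypothetical helper, not part of the package: joins famid and an id vector
# with an underscore. Keeping NA ids as NA is an assumption of this sketch.
prefix_famid <- function(x, famid) {
    ifelse(is.na(x), NA_character_, paste(famid, x, sep = "_"))
}

id    <- c("fam101", "fam103")
dadid <- c(NA, "fam101")
famid <- c("1", "1")

new_id    <- prefix_famid(id, famid)
new_dadid <- prefix_famid(dadid, famid)

print(new_id)
print(new_dadid)
```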
Special case for dataframe
Check that both parents' ids are present in the id field. If they are not both present, the behaviour depends on the del_parents parameter:
If TRUE, use the fix_parents function, merge the other fields back into the data.frame, then set availability to 0 for non-available parents.
If FALSE, delete the ids of the missing parents.
Examples
# Small pedigree in which fam112 is referenced as a mother but has no row
test1char <- data.frame(
    id = paste('fam', 101:111, sep = ''),
    sex = c('male', 'female')[c(1, 2, 1, 2, 1, 1, 2, 2, 1, 2, 1)],
    father = c(
        0, 0, 'fam101', 'fam101', 'fam101', 0, 0,
        'fam106', 'fam106', 'fam106', 'fam109'
    ),
    mother = c(
        0, 0, 'fam102', 'fam102', 'fam102', 0, 0,
        'fam107', 'fam107', 'fam107', 'fam112'
    )
)
# Add the missing parents and fix the sex of the parents
test1newmom <- with(test1char, fix_parents(id, father, mother,
    sex,
    missid = NA_character_
))
Pedigree(test1newmom)
#> Warning: The Pedigree informations are not valid. Here is the normalised Pedigree informations with the identified problems
#> indId motherId fatherId gender family sex steril status avail id
#> 1 fam101 0 0 1 1 male NA NA NA 1_fam101
#> 2 fam102 0 0 2 1 female NA NA NA 1_fam102
#> 3 fam103 fam102 fam101 1 1 male NA NA NA 1_fam103
#> 4 fam104 fam102 fam101 2 1 female NA NA NA 1_fam104
#> 5 fam105 fam102 fam101 1 1 male NA NA NA 1_fam105
#> 6 fam106 0 0 1 1 male NA NA NA 1_fam106
#> 7 fam107 0 0 2 1 female NA NA NA 1_fam107
#> 8 fam108 fam107 fam106 2 1 female NA NA NA 1_fam108
#> 9 fam109 fam107 fam106 1 1 male NA NA NA 1_fam109
#> 10 fam110 fam107 fam106 2 1 female NA NA NA 1_fam110
#> 11 fam111 fam112 fam109 1 1 male NA NA NA 1_fam111
#> 12 0 <NA> <NA> 1 1 female NA NA NA 1_0
#> 13 0 <NA> <NA> 2 <NA> female NA NA NA 0
#> 14 fam112 <NA> <NA> 2 1 female NA NA NA 1_fam112
#> dadid momid famid error affected
#> 1 1_0 1_0 1 <NA> NA
#> 2 1_0 1_0 1 <NA> NA
#> 3 1_fam101 1_fam102 1 <NA> NA
#> 4 1_fam101 1_fam102 1 <NA> NA
#> 5 1_fam101 1_fam102 1 <NA> NA
#> 6 1_0 1_0 1 <NA> NA
#> 7 1_0 1_0 1 <NA> NA
#> 8 1_fam106 1_fam107 1 <NA> NA
#> 9 1_fam106 1_fam107 1 <NA> NA
#> 10 1_fam106 1_fam107 1 <NA> NA
#> 11 1_fam109 1_fam112 1 <NA> NA
#> 12 <NA> <NA> 1 isMotherAndFather_isFatherButNotMale NA
#> 13 <NA> <NA> <NA> <NA> NA
#> 14 <NA> <NA> 1 <NA> NA
#> available sterilisation vitalStatus affection
#> 1 NA NA NA NA
#> 2 NA NA NA NA
#> 3 NA NA NA NA
#> 4 NA NA NA NA
#> 5 NA NA NA NA
#> 6 NA NA NA NA
#> 7 NA NA NA NA
#> 8 NA NA NA NA
#> 9 NA NA NA NA
#> 10 NA NA NA NA
#> 11 NA NA NA NA
#> 12 NA NA NA NA
#> 13 NA NA NA NA
#> 14 NA NA NA NA