Fix the sex of parents and add parents that are missing from the data. Can be used with a data.frame or with vectors of the individuals' information.
Usage
# S4 method for class 'character'
fix_parents(obj, dadid, momid, sex, famid = NULL, missid = NA_character_)
# S4 method for class 'data.frame'
fix_parents(obj, del_parents = NULL, filter = NULL, missid = NA_character_)
Arguments
- obj
A data.frame or a vector of the individuals' identifiers. If a data.frame is given, it must contain the columns id, dadid, momid, sex and famid (optional).
- dadid
A vector containing, for each subject, the identifier of the biological father.
- momid
A vector containing, for each subject, the identifier of the biological mother.
- sex
A character, factor or numeric vector corresponding to the gender of the individuals. This will be transformed to an ordered factor with the following levels: male < female < unknown.
The following values are recognized:
"male": "m", "male", "man", 1
"female": "f", "female", "woman", 2
"unknown": "unknown", 3
- famid
A character vector with the family identifiers of the individuals. If provided, it will be combined with the individuals' identifiers, separated by an underscore.
- missid
A character vector of the identifiers that denote missing values. All id, dadid and momid values matching them will be set to NA_character_.
- del_parents
Defines whether missing parents are deleted or fixed. If one, both parents are removed as soon as one of them is missing. If both, the parents are removed only when both are missing. If NULL, no parent is removed and the missing parents are added as new rows.
- filter
Filtering column containing 0 or 1 for the rows to keep before proceeding.
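The recoding described for the sex argument can be sketched in base R as follows. This is only an illustration: normalise_sex is a hypothetical helper, not a function of the package, and the case-insensitive matching and the NA returned for unrecognized values are assumptions of this sketch.

```r
# Hypothetical helper (not part of the package API): recode sex values
# following the mapping listed under the sex argument.
normalise_sex <- function(sex) {
    # Assumption: matching is done on lower-cased character values
    sex <- tolower(as.character(sex))
    out <- rep(NA_character_, length(sex))
    out[sex %in% c("m", "male", "man", "1")] <- "male"
    out[sex %in% c("f", "female", "woman", "2")] <- "female"
    out[sex %in% c("unknown", "3")] <- "unknown"
    # Ordered factor with levels male < female < unknown
    factor(out, levels = c("male", "female", "unknown"), ordered = TRUE)
}

sex_fct <- normalise_sex(c("M", "woman", 2, "unknown", 1))
print(sex_fct)
```

Because the result is an ordered factor, comparisons such as sex_fct[1] < sex_fct[2] are defined.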
Details
First, add the parents whose ids are given in momid/dadid but who have no
entry of their own. Second, fix the sex of the parents. Last, add a second
parent for children for whom only one parent id is given.
If a famid vector is given, the family id will be added to the
ids of all individuals (id, dadid, momid),
separated by an underscore, before proceeding.
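The first two steps and the famid prefixing can be pictured with a small base R sketch. It does not reproduce the package's implementation: the toy data and the prefix_famid helper are hypothetical, and the third step (adding a second parent) is left out.

```r
# Toy inputs (illustrative only)
id    <- c("1", "2", "3")
dadid <- c(NA, NA, "1")
momid <- c(NA, NA, "4")   # "4" never appears in id
sex   <- c("male", "female", "male")
famid <- c("A", "A", "A")

# Hypothetical helper: prefix non-missing ids with the family id
prefix_famid <- function(x, famid) {
    ifelse(is.na(x), NA_character_, paste(famid, x, sep = "_"))
}
id    <- prefix_famid(id, famid)
dadid <- prefix_famid(dadid, famid)
momid <- prefix_famid(momid, famid)

# Step 1: parents referenced in dadid / momid but absent from id
new_dads <- setdiff(dadid[!is.na(dadid)], id)
new_moms <- setdiff(momid[!is.na(momid)], id)
n_new <- length(new_dads) + length(new_moms)

# Add them as new rows without parents of their own
ped <- data.frame(
    id    = c(id, new_dads, new_moms),
    dadid = c(dadid, rep(NA_character_, n_new)),
    momid = c(momid, rep(NA_character_, n_new)),
    sex   = c(sex, rep("male", length(new_dads)),
              rep("female", length(new_moms))),
    stringsAsFactors = FALSE
)

# Step 2: anyone used as a father is male, anyone used as a mother is female
ped$sex[ped$id %in% ped$dadid] <- "male"
ped$sex[ped$id %in% ped$momid] <- "female"
print(ped)
```

In this toy data the mother "A_4" is added as a new female row, while the father "A_1" was already present.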
Special case for dataframe
Check that both parents' ids are present in the id field. If they are not both present, the behaviour depends on the delete parameter:
- If TRUE, use the fix_parents function, merge the other fields back into the data.frame, then set availability to 0 for non-available parents.
- If FALSE, delete the ids of the missing parents.
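The presence check itself can be illustrated in base R. The snippet below is a sketch on hypothetical toy data and only flags the affected rows; what happens to them afterwards is decided by the method as described above.

```r
# Toy data.frame (illustrative only)
df <- data.frame(
    id    = c("1", "2", "3", "4"),
    dadid = c(NA, NA, "1", "1"),
    momid = c(NA, NA, "2", "9"),   # "9" has no row of its own
    stringsAsFactors = FALSE
)

# Is each referenced parent present in the id column?
dad_known <- df$dadid %in% df$id
mom_known <- df$momid %in% df$id

# Rows where a parent id is given but absent from the id column
parent_missing <- (!is.na(df$dadid) & !dad_known) |
    (!is.na(df$momid) & !mom_known)
print(df[parent_missing, ])
```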
Examples
test1char <- data.frame(
    id = paste('fam', 101:111, sep = ''),
    sex = c('male', 'female')[c(1, 2, 1, 2, 1, 1, 2, 2, 1, 2, 1)],
    father = c(
        0, 0, 'fam101', 'fam101', 'fam101', 0, 0,
        'fam106', 'fam106', 'fam106', 'fam109'
    ),
    mother = c(
        0, 0, 'fam102', 'fam102', 'fam102', 0, 0,
        'fam107', 'fam107', 'fam107', 'fam112'
    )
)
test1newmom <- with(test1char, fix_parents(
    id, father, mother, sex,
    missid = NA_character_
))
Pedigree(test1newmom)
#> Warning: The Pedigree informations are not valid. Here is the normalised Pedigree informations with the identified problems
#> id momid dadid sex famid
#> 1 1_fam101 1_0 1_0 male 1
#> 2 1_fam102 1_0 1_0 female 1
#> 3 1_fam103 1_fam102 1_fam101 male 1
#> 4 1_fam104 1_fam102 1_fam101 female 1
#> 5 1_fam105 1_fam102 1_fam101 male 1
#> 6 1_fam106 1_0 1_0 male 1
#> 7 1_fam107 1_0 1_0 female 1
#> 8 1_fam108 1_fam107 1_fam106 female 1
#> 9 1_fam109 1_fam107 1_fam106 male 1
#> 10 1_fam110 1_fam107 1_fam106 female 1
#> 11 1_fam111 1_fam112 1_fam109 male 1
#> 12 1_0 <NA> <NA> female 1
#> 13 0 <NA> <NA> female <NA>
#> 14 1_fam112 <NA> <NA> female 1
#> error fertility miscarriage deceased
#> 1 <NA> fertile FALSE NA
#> 2 <NA> fertile FALSE NA
#> 3 <NA> fertile FALSE NA
#> 4 <NA> fertile FALSE NA
#> 5 <NA> fertile FALSE NA
#> 6 <NA> fertile FALSE NA
#> 7 <NA> fertile FALSE NA
#> 8 <NA> fertile FALSE NA
#> 9 <NA> fertile FALSE NA
#> 10 <NA> fertile FALSE NA
#> 11 <NA> fertile FALSE NA
#> 12 is-mother-and-father_is-father-but-not-male fertile FALSE NA
#> 13 <NA> fertile FALSE NA
#> 14 <NA> fertile FALSE NA
#> avail evaluated consultand proband carrier asymptomatic adopted dateofbirth
#> 1 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 2 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 3 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 4 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 5 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 6 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 7 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 8 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 9 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 10 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 11 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 12 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 13 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> 14 NA FALSE FALSE FALSE NA NA FALSE <NA>
#> dateofdeath
#> 1 <NA>
#> 2 <NA>
#> 3 <NA>
#> 4 <NA>
#> 5 <NA>
#> 6 <NA>
#> 7 <NA>
#> 8 <NA>
#> 9 <NA>
#> 10 <NA>
#> 11 <NA>
#> 12 <NA>
#> 13 <NA>
#> 14 <NA>