Fix the sex of parents and add parents that are missing from the data. Can be used with a data.frame or with vectors describing the individuals.

Usage

# S4 method for class 'character'
fix_parents(obj, dadid, momid, sex, famid = NULL, missid = NA_character_)

# S4 method for class 'data.frame'
fix_parents(obj, del_parents = NULL, filter = NULL, missid = NA_character_)

Arguments

obj

A data.frame or a vector of individual identifiers. If a data.frame is given, it must contain the columns id, dadid, momid, sex and, optionally, famid.

dadid

A vector containing, for each subject, the identifier of the biological father.

momid

A vector containing, for each subject, the identifier of the biological mother.

sex

A character, factor or numeric vector corresponding to the gender of the individuals. This will be transformed to an ordered factor with the following levels: male < female < unknown

The following values are recognized:

  • "male": "m", "male", "man", 1

  • "female": "f", "female", "woman", 2

  • "unknown": "unknown", 3
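The mapping above can be sketched in base R. This is only an illustration, not the package's implementation: `normalise_sex` is a hypothetical helper, and treating unrecognised values as NA is an assumption.

```r
# Hypothetical helper (not part of the package): map the recognised
# codes onto an ordered factor with levels male < female < unknown.
normalise_sex <- function(sex) {
    sex <- as.character(sex)
    out <- rep(NA_character_, length(sex))  # assumption: unrecognised -> NA
    out[sex %in% c("m", "male", "man", "1")] <- "male"
    out[sex %in% c("f", "female", "woman", "2")] <- "female"
    out[sex %in% c("unknown", "3")] <- "unknown"
    factor(out, levels = c("male", "female", "unknown"), ordered = TRUE)
}

# Mixed character and numeric codes are coerced to character by c()
sex_fct <- normalise_sex(c("m", "female", 2, "unknown", 1))
```

Because the factor is ordered, comparisons such as `sex_fct[1] < sex_fct[2]` follow the level order given above.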

famid

A character vector with the family identifiers of the individuals. If provided, it will be prepended to the individuals' identifiers, separated by an underscore.
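As a rough illustration of this aggregation, the base R sketch below builds identifiers of the form famid_id. `prefix_famid` is a hypothetical helper, and leaving NA identifiers untouched is an assumption rather than documented behaviour.

```r
# Hypothetical helper (not part of the package): prepend the family
# identifier to an identifier vector, separated by an underscore.
prefix_famid <- function(x, famid) {
    # assumption: missing identifiers stay NA instead of becoming "1_NA"
    ifelse(is.na(x), NA_character_, paste(famid, x, sep = "_"))
}

id    <- c("101", "102", "103")
dadid <- c(NA, NA, "101")
momid <- c(NA, NA, "102")
famid <- c("1", "1", "1")

new_id    <- prefix_famid(id, famid)
new_dadid <- prefix_famid(dadid, famid)
new_momid <- prefix_famid(momid, famid)
```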

missid

A character vector of identifiers that denote missing values. Every id, dadid and momid equal to one of these values will be set to NA_character_.
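The recoding described for missid amounts to something like the following base R sketch. The vectors are made-up illustration data, and the package may perform this step differently.

```r
# Illustration data: "0" and "" are treated as missing-value codes here
ids    <- c("fam101", "0", "fam102", "")
missid <- c("0", "")

# Set every identifier that matches a missing-value code to NA
ids[ids %in% missid] <- NA_character_
```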

del_parents

Defines whether missing parents are deleted or fixed. If "one", both parents are removed when either one of them is missing. If "both", both parents are removed only when both are missing. If NULL, no parent is removed and the missing parents are added as new rows.

filter

Filtering column containing 0 or 1, indicating which rows to keep before proceeding.

Value

A data.frame with id, dadid, momid, sex as columns with the relationships fixed.

Details

The function proceeds in three steps. First, it adds the parents whose ids are given in momid/dadid. Second, it fixes the sex of the parents. Last, it adds a second parent for children for whom only one parent id is given. If a famid vector is given, the family id is added to the ids of all individuals (id, dadid, momid), separated by an underscore, before proceeding.
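The first step, adding parents that are referenced but have no row of their own, can be sketched in base R as follows. This is a simplified illustration on made-up data, not the package's code: it ignores the sex-fixing and second-parent steps, and it assumes that added fathers are coded male and added mothers female.

```r
# Illustration data: individual "3" refers to a mother "4" with no row
df <- data.frame(
    id    = c("1", "2", "3"),
    dadid = c(NA, NA, "1"),
    momid = c(NA, NA, "4"),
    sex   = c("male", "female", "male")
)

# Parent ids that are referenced but absent from the id column
missing_dad <- setdiff(na.omit(df$dadid), df$id)
missing_mom <- setdiff(na.omit(df$momid), df$id)
n_new <- length(missing_dad) + length(missing_mom)

# One new founder row per missing parent (assumed sex by parental role)
new_rows <- data.frame(
    id    = c(missing_dad, missing_mom),
    dadid = rep(NA_character_, n_new),
    momid = rep(NA_character_, n_new),
    sex   = c(
        rep("male", length(missing_dad)),
        rep("female", length(missing_mom))
    )
)

fixed <- rbind(df, new_rows)
```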

Special case for dataframe

Checks that the ids of both parents are present in the id field. If they are not both present, the behaviour depends on the delete parameter:

  • If TRUE, the fix_parents function is used and the other fields are merged back into the data.frame; availability is then set to 0 for non-available parents.

  • If FALSE, the ids of the missing parents are deleted.

Author

Jason Sinnwell

Examples


test1char <- data.frame(
    id = paste('fam', 101:111, sep = ''),
    sex = c('male', 'female')[c(1, 2, 1, 2, 1, 1, 2, 2, 1, 2, 1)],
    father = c(
        0, 0, 'fam101', 'fam101', 'fam101', 0, 0,
        'fam106', 'fam106', 'fam106', 'fam109'
    ),
    mother = c(
        0, 0, 'fam102', 'fam102', 'fam102', 0, 0,
        'fam107', 'fam107', 'fam107', 'fam112'
    )
)
test1newmom <- with(test1char, fix_parents(id, father, mother,
    sex,
    missid = NA_character_
))
Pedigree(test1newmom)
#> Warning: The Pedigree informations are not valid. Here is the normalised Pedigree informations with the identified problems
#>          id    momid    dadid    sex famid
#> 1  1_fam101      1_0      1_0   male     1
#> 2  1_fam102      1_0      1_0 female     1
#> 3  1_fam103 1_fam102 1_fam101   male     1
#> 4  1_fam104 1_fam102 1_fam101 female     1
#> 5  1_fam105 1_fam102 1_fam101   male     1
#> 6  1_fam106      1_0      1_0   male     1
#> 7  1_fam107      1_0      1_0 female     1
#> 8  1_fam108 1_fam107 1_fam106 female     1
#> 9  1_fam109 1_fam107 1_fam106   male     1
#> 10 1_fam110 1_fam107 1_fam106 female     1
#> 11 1_fam111 1_fam112 1_fam109   male     1
#> 12      1_0     <NA>     <NA> female     1
#> 13        0     <NA>     <NA> female  <NA>
#> 14 1_fam112     <NA>     <NA> female     1
#>                                          error fertility miscarriage deceased
#> 1                                         <NA>   fertile       FALSE       NA
#> 2                                         <NA>   fertile       FALSE       NA
#> 3                                         <NA>   fertile       FALSE       NA
#> 4                                         <NA>   fertile       FALSE       NA
#> 5                                         <NA>   fertile       FALSE       NA
#> 6                                         <NA>   fertile       FALSE       NA
#> 7                                         <NA>   fertile       FALSE       NA
#> 8                                         <NA>   fertile       FALSE       NA
#> 9                                         <NA>   fertile       FALSE       NA
#> 10                                        <NA>   fertile       FALSE       NA
#> 11                                        <NA>   fertile       FALSE       NA
#> 12 is-mother-and-father_is-father-but-not-male   fertile       FALSE       NA
#> 13                                        <NA>   fertile       FALSE       NA
#> 14                                        <NA>   fertile       FALSE       NA
#>    avail evaluated consultand proband carrier asymptomatic adopted dateofbirth
#> 1     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 2     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 3     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 4     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 5     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 6     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 7     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 8     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 9     NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 10    NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 11    NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 12    NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 13    NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#> 14    NA     FALSE      FALSE   FALSE      NA           NA   FALSE        <NA>
#>    dateofdeath
#> 1         <NA>
#> 2         <NA>
#> 3         <NA>
#> 4         <NA>
#> 5         <NA>
#> 6         <NA>
#> 7         <NA>
#> 8         <NA>
#> 9         <NA>
#> 10        <NA>
#> 11        <NA>
#> 12        <NA>
#> 13        <NA>
#> 14        <NA>