Network Scale-Up Models for Aggregated Relational Data

Project description

This package fits several different Network Scale-Up Models (NSUM) to Aggregated Relational Data (ARD). ARD represents survey responses to questions of the form: "How many X’s do you know?", where respondents report how many people they know in different subpopulations.

Specifically, if Nᵢ respondents are asked about Nₖ subpopulations, then the ARD is an Nᵢ × Nₖ matrix, where the (i, k) element represents how many people respondent i reports knowing in subpopulation k.
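As a small illustration, an ARD matrix can be held in a NumPy array with one row per respondent and one column per subpopulation. The counts below are made up for this sketch and do not come from any survey.

```python
import numpy as np

# Hypothetical ARD: 4 respondents (rows) asked about 3 subpopulations (columns).
ard = np.array([
    [2, 0, 1],
    [5, 1, 0],
    [0, 0, 0],
    [3, 2, 4],
])

n_respondents, n_subpops = ard.shape

# ard[i, k] is how many people respondent i reports knowing in subpopulation k.
known_by_first = ard[0, 2]
```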

NSUM leverages these responses to estimate the unknown size of hard-to-reach populations.

In this package, we provide functions to estimate the size and accompanying parameters (e.g. degrees) from 2 papers:

Killworth, P. D., Johnsen, E. C., McCarty, C., Shelley, G. A., and Bernard, H. R. (1998) plug-in MLE

Killworth, P. D., McCarty, C., Bernard, H. R., Shelley, G. A., and Johnsen, E. C. (1998) MLE

Requirements

This package requires the following Python libraries:

  • numpy
  • pandas

PIMLE

The plug-in MLE (PIMLE) estimator from Killworth, P. D., Johnsen, E. C., McCarty, C., Shelley, G. A., and Bernard, H. R. (1998) is a two-stage estimator. The first stage estimates the degree dᵢ of each respondent by maximizing the following likelihood:

L(dᵢ; y, {Nₖ}) = ∏ₖ₌₁ᴸ [ C(dᵢ, yᵢₖ) × (Nₖ / N)^yᵢₖ × (1 − Nₖ / N)^(dᵢ − yᵢₖ) ],

where:

  • L is the number of subpopulations with known sizes Nₖ,
  • N is the size of the total population,
  • yᵢₖ is the number of people respondent i reports knowing in subpopulation k,
  • C(dᵢ, yᵢₖ) is the binomial coefficient "dᵢ choose yᵢₖ".

In the second stage, the model plugs the estimated dᵢ into the equation:

yᵢₖ / dᵢ = Nₖ / N

and solves for the unknown Nₖ for each respondent. These estimates are then averaged to obtain a single estimate of Nₖ.

To summarize, Stage 1 estimates dᵢ using:

dᵢ = N × (∑ₖ₌₁ᴸ yᵢₖ) / (∑ₖ₌₁ᴸ Nₖ)
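The Stage 1 formula can be sketched directly in NumPy. This is a from-scratch illustration, not the package's own code: the function name `estimate_degrees` and the example numbers are hypothetical.

```python
import numpy as np

def estimate_degrees(ard_known, known_sizes, N):
    """Stage 1: d_i = N * (sum_k y_ik) / (sum_k N_k) over the known subpopulations.

    Hypothetical helper written for illustration only.
    """
    ard_known = np.asarray(ard_known, dtype=float)
    known_sizes = np.asarray(known_sizes, dtype=float)
    # Row sums give each respondent's total reported ties to the known groups.
    return N * ard_known.sum(axis=1) / known_sizes.sum()

# Made-up data: 3 respondents, 3 subpopulations of known size.
ard_known = np.array([[2, 0, 1],
                      [5, 1, 0],
                      [0, 0, 0]])
known_sizes = np.array([1000, 500, 1500])
N = 100_000

degrees = estimate_degrees(ard_known, known_sizes, N)
```

With these numbers the three degree estimates are 100, 200, and 0. The last respondent reports knowing nobody in the known subpopulations, so the estimated degree is 0.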

Stage 2 estimates the unknown subpopulation size Nₖ with:

N̂ₖᴾᴵᴹᴸᴱ = (N / n) × ∑ᵢ₌₁ⁿ (yᵢₖ / dᵢ)
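A minimal NumPy sketch of the Stage 2 formula follows. It assumes every dᵢ is strictly positive. The name `pimle_size` and the data are hypothetical, chosen only to illustrate the arithmetic.

```python
import numpy as np

def pimle_size(y_unknown, degrees, N):
    """Stage 2 (PIMLE): (N / n) * sum_i (y_ik / d_i) for one unknown subpopulation.

    Hypothetical helper; assumes all entries of `degrees` are positive.
    """
    y_unknown = np.asarray(y_unknown, dtype=float)
    degrees = np.asarray(degrees, dtype=float)
    # The mean over respondents is (1 / n) * sum_i (y_ik / d_i).
    return N * np.mean(y_unknown / degrees)

degrees = np.array([100.0, 200.0, 400.0])  # Stage 1 output (made up)
y_unknown = np.array([1, 1, 4])            # reported ties to the unknown group
N = 100_000

estimate = pimle_size(y_unknown, degrees, N)
```

Here the per-respondent ratios are 0.01, 0.005, and 0.01, so the estimate is N times their mean, roughly 833.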

Here is an example of this package creating an estimate using the PIMLE function:

pimle.est = killworth(ard, known_sizes = sizes[c(1, 2, 4)], known_ind = c(1, 2, 4), N = N, model = "PIMLE")

Note that the function will issue a warning saying that at least one dᵢ was 0. This occurs when a respondent does not report knowing anyone in the known subpopulations. This is a problem for the PIMLE because dᵢ appears in the denominator of N̂ₖᴾᴵᴹᴸᴱ. Thus, we ignore the responses from respondents with dᵢ = 0.
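Dropping respondents with a zero degree estimate can be sketched with a boolean mask. This is an illustration rather than the package's implementation. In particular, taking n to be the number of retained respondents is an assumption made here.

```python
import numpy as np

degrees = np.array([100.0, 0.0, 400.0])  # made-up Stage 1 output; one respondent has d_i = 0
y_unknown = np.array([1, 2, 4])
N = 100_000

keep = degrees > 0                       # respondents with a usable degree estimate
ratios = y_unknown[keep] / degrees[keep]

# Assumption: average over the retained respondents only.
estimate = N * ratios.mean()
n_dropped = int((~keep).sum())
```

Both retained ratios equal 0.01, so the estimate is 1000, and one respondent is dropped.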

MLE

Next, we consider the MLE estimator from Killworth, P. D., McCarty, C., Bernard, H. R., Shelley, G. A., and Johnsen, E. C. (1998). This is also a two-stage model, with an identical first stage, i.e.

d̂ᵢ = N ⋅ (∑ₖ₌₁ᴸ yᵢₖ) / (∑ₖ₌₁ᴸ Nₖ)

However, the second stage estimates Nₖ by maximizing the Binomial likelihood with respect to Nₖ, fixing dᵢ at the estimated d̂ᵢ. Thus, the estimate for the unknown subpopulation size is given by

N̂ₖᴹᴸᴱ = N ⋅ (∑ᵢ₌₁ⁿ yᵢₖ) / (∑ᵢ₌₁ⁿ d̂ᵢ)

For example, the estimate can be obtained using:

mle.est = killworth(ard, known_sizes = sizes[c(1, 2, 4)], known_ind = c(1, 2, 4), N = N, model = "MLE")

Note that this function will not issue a warning when some dᵢ = 0, since the denominator depends only on the sum of the d̂ᵢ.
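The MLE second stage can be sketched the same way. The helper name `mle_size` and the data are hypothetical. The example deliberately includes a respondent with a zero degree estimate to show that the ratio of sums is still well defined.

```python
import numpy as np

def mle_size(y_unknown, degrees, N):
    """Stage 2 (MLE): N * (sum_i y_ik) / (sum_i d_i) for one unknown subpopulation.

    Hypothetical helper written for illustration only.
    """
    y_unknown = np.asarray(y_unknown, dtype=float)
    degrees = np.asarray(degrees, dtype=float)
    return N * y_unknown.sum() / degrees.sum()

degrees = np.array([100.0, 0.0, 400.0])  # made-up Stage 1 output, including a zero
y_unknown = np.array([1, 2, 4])
N = 100_000

estimate = mle_size(y_unknown, degrees, N)
```

The reported ties sum to 7 and the degrees sum to 500, so the estimate is 100000 × 7 / 500 = 1400.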

Download files

Source Distribution

networkscaleup-0.0.7.tar.gz (4.0 kB)

Built Distribution

networkscaleup-0.0.7-py3-none-any.whl (4.9 kB)
