this post was submitted on 14 Nov 2024

No Stupid Questions


Basically exactly what the title says. In case there isn't a great place, or this post ends up getting more visibility than wherever I end up asking, I will explain my approximate competency level and the question below.

In terms of competency, I have an engineering background and degree, which means I had a single class in statistics. Technically I was one class (Graph Theory) short of a math minor when I graduated. Unlike most engineers and Six Sigma "graduates", I don't think this automatically makes me some kind of math/stats wizard. I'm aware I know just enough that I can unintentionally massage data to fit my bias (mini rant over).

My question is: when looking at a human population and trying to find the approximate subset of people with certain attributes, how are correlations handled to avoid double counting?

For example, let's say I am looking at a specific city and my data sets are the most recent census, BLS.gov, and Pew Research. With the above sources I can pretty easily estimate something along the lines of

The number of men in a US city who:

  • Are between the ages of 22 and 44
  • Have a STEM degree

However, if I then wanted to add another factor:

  • Are/Vote liberal

I know that is going to interfere with the original criteria, because higher levels of education are correlated with people being more liberal. If I just multiplied the percentages from all three data points together, the resulting number would likely be much smaller than reality.
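To make the concern concrete, here is a minimal sketch in Python. Every number in it is a made-up placeholder, not a real census, BLS, or Pew figure, and the function names are hypothetical. It compares multiplying the standalone percentages (which quietly assumes the attributes are independent) with using a conditional percentage, i.e. the share of STEM-degree holders who are liberal, assuming such a figure were available.

```python
# All figures are hypothetical placeholders, NOT real census, BLS, or Pew data.
men_22_44 = 100_000          # men aged 22-44 in the city (made up)
p_stem = 0.10                # share of that group with a STEM degree (made up)
p_liberal = 0.40             # share of that group who are/vote liberal (made up)
p_liberal_given_stem = 0.55  # share of the STEM-degree holders who are liberal (made up)


def naive_estimate(population, *shares):
    """Multiply standalone shares together; this assumes the attributes are independent."""
    estimate = population
    for share in shares:
        estimate *= share
    return estimate


def conditional_estimate(population, p_a, p_b_given_a):
    """Use P(A and B) = P(A) * P(B | A), which accounts for the overlap between A and B."""
    return population * p_a * p_b_given_a


naive = naive_estimate(men_22_44, p_stem, p_liberal)
adjusted = conditional_estimate(men_22_44, p_stem, p_liberal_given_stem)

print("independence assumed:", round(naive))
print("using conditional share:", round(adjusted))
```

With these placeholder inputs the independence version comes out lower than the conditional version, which is the undercount described above. The catch is that the conditional share has to come from a source that measured both attributes on the same people.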

Is there a term or method I can read up on for how to account for overlaps/correlations between population subsets? Does this make sense or am I asking the wrong kind of question?

FWIW, none of this is related to my job, an argument, a shit post, a data graphic, or anything else I will ever really make. It's just for something specific (not actually the above example, but something like it using the sources I mentioned) that I am personally curious about. I have also been wondering more generally about how to account for this kind of overlap for a couple of years now.

Regardless, thanks for taking the time to at least read all this.

Cheers!

[–] ptz@dubvee.org 5 points 3 days ago* (last edited 3 days ago)

The only one I can really see is !statistics@lemmy.world but it doesn't appear to be active at all. The only moderator for it hasn't posted anything in a year or so.

Looks like some "if you build it, they will come" is needed.

Not sure of the procedure, but you could reach out to the LW admins about taking over the community if the mod is confirmed AWOL.