Well, to accurately define what each unit is, we need to peel away decades of marketing crap.
A core, in its simplest form, is nothing more than a load/store unit (LDU) plus an arithmetic logic unit (ALU) consisting of an adder and a multiplier; the ALU operates on integers and is also called an execution unit. A core of this form should be capable of one instruction per clock, but in practice it stalls often and retires less than that, anywhere from 0.1 to 0.9 instructions per clock, which is why it's called sub-scalar, or barely scalar in its best case.
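As a quick sketch of the arithmetic, instructions per clock (IPC) is just retired instructions divided by elapsed clocks; the numbers below are made up purely for illustration:

```python
# IPC = retired instructions / clock cycles.
# Illustrative numbers: a simple in-order core loses many cycles to stalls.
instructions = 700
cycles = 1000            # say 300 of these cycles were memory stalls
ipc = instructions / cycles
print(ipc)               # 0.7 -> sub-scalar
```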
The core later evolved to include an address generation unit (AGU) to handle complex memory configurations, and a separate ALU for floating point operations, which they called the FPU. The distinction here is very important: they could have called this a 2-core CPU, since it had two execution units (one integer and one floating point), but they chose not to, because the core still had one LDU and one AGU, took data from a single thread, and shared registers, caches, and control logic between the units.
They later evolved the core with pipelining (dividing instruction execution into stages, so several instructions are in flight at once), to achieve guaranteed scalar operation (the core can retire one instruction per clock) and potentially superscalar operation (more than one instruction per clock).
Later they multiplied the execution units inside the core (e.g. 4 ALUs, 2 FPUs, 4 LDUs, 2 AGUs) and pushed superscalar operation to consistently exceed one instruction per clock. They refused to call this kind of core multi-core, since it still shared caches, registers, and control logic between units, still handled instructions from only one thread, and its output still hovered around 1 instruction per clock (1.2, 1.5, etc.).
During all of this, a new form of computing was invented, called packed math: the core loads multiple data items of the same type into one wide register and executes a single instruction on all of them at once, using its multiple execution units. It is effectively multiple data per single instruction, or Single Instruction Multiple Data (SIMD). For example, instead of loading each pixel of an image serially, 4 pixels are loaded as one packed value and processed as one, speeding up execution considerably. However, SIMD is still not multi-core, because it operates within the confines of a superscalar core, and it's basically a data-packing trick.
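The packed-pixel idea can be sketched in plain Python using the old "SIMD within a register" trick: four 8-bit lanes live in one 32-bit integer, and a single add touches all four, with masking to stop carries leaking between lanes. This is a conceptual model of packed math, not real SIMD hardware:

```python
def pack4(b0, b1, b2, b3):
    """Pack four 8-bit lanes into one 32-bit value."""
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)

def unpack4(x):
    """Split a 32-bit value back into its four 8-bit lanes."""
    return [(x >> s) & 0xFF for s in (0, 8, 16, 24)]

def packed_add(x, y):
    """Add all four 8-bit lanes at once; the masks keep each
    lane's carry from spilling into its neighbor."""
    return ((x & 0x7F7F7F7F) + (y & 0x7F7F7F7F)) ^ ((x ^ y) & 0x80808080)

pixels     = pack4(10, 20, 30, 200)
brightness = pack4(5, 5, 5, 100)
print(unpack4(packed_add(pixels, brightness)))  # [15, 25, 35, 44] -- last lane wraps mod 256
```

One "add" here did four additions, which is exactly the SIMD speedup, just done in software instead of dedicated lanes.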
Later they dedicated separate execution units (ALUs and FPUs) to SIMD, but they still did not call this multi-core, because it still operated within the confines of a single thread and shared resources and caches between the SIMD execution units and the regular ones. Examples of SIMD instruction sets inside the CPU core include MMX, SSE, and AVX.
Later they added the ability to load two or more threads onto the core: if one thread stalls, another thread takes its place. This is the Simultaneous Multi-Threading (SMT) or Hyper-Threading (HT) approach, and it also maximizes the superscalar operation of the core. They still refused to call it multi-core, because the gain is conditional (it relies on stalls), the threads still share the caches, and the final output still hovered around 1 instruction per clock (1.7, 1.9, etc.).
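A toy model of the idea (the traces and function names are mine, purely illustrative): one issue slot per cycle, 'I' is a ready instruction, 'S' is a stall, and under SMT thread B issues whenever thread A's slot would otherwise be wasted:

```python
def ipc_single(trace):
    """One thread on one issue slot: stall cycles ('S') are wasted."""
    return trace.count('I') / len(trace)

def ipc_smt(trace_a, trace_b):
    """Two threads share the slot: when A stalls, B issues instead."""
    a, b = list(trace_a), list(trace_b)
    cycles = issued = 0
    while a or b:
        cycles += 1
        if a and a[0] == 'I':
            a.pop(0); issued += 1
        elif b and b[0] == 'I':
            b.pop(0); issued += 1
            if a: a.pop(0)        # A's stall cycle still elapses
        else:                     # both stalled (or drained): slot wasted
            if a: a.pop(0)
            if b: b.pop(0)
    return issued / cycles

print(ipc_single("IISIIS"))        # ~0.67: sub-scalar alone
print(ipc_smt("IISIIS", "IIII"))   # 1.0: B fills A's stall cycles
```

Note the combined throughput climbs toward, but not much past, the width of the shared core, which is why SMT results hover around 1 rather than doubling.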
So pipelining, multiple execution units, SIMD, and SMT all exist to boost the instructions per clock of a single core, which now stands at 2 to 4 instructions per clock per core in modern CPUs. That's how strict the definition of a core is in the CPU world: even though we've deviated far from the original core definition, we refused to call any of these deviations multi-core, because they took data from a single thread and shared resources (caches, registers, memory bus, control logic) with other units.
In the harsh world of CPU core definitions, a true multi-core is multiple superscalar cores, each operating on a different thread, with a large cache shared between all of them.
AMD tried to change the definition of a CPU core with Bulldozer, counting each integer ALU block as a true core while sharing an FPU block between "these cores", but they later backtracked from this.
A GPU core is "supposedly" the same as a CPU core; the difference is that a GPU core is a single execution unit, specifically a fused multiply-add (FMA) unit that handles mixed precision (integer and floating point). A GPU has thousands of these units, each capable of operating on data from a different thread, which is why GPU vendors count each execution unit as a core, on the grounds that each handles a different thread.
However, I am not convinced by this. The GPU has many types of "cores": texture, raster, shader, tensor, ray tracing, etc. Should we consider them all equal cores and include all of them in the final count? Furthermore, most of these "cores" share resources with other cores, including register files, caches, and control logic.
If we apply the same strict CPU definition, then the SM or CU (a large constellation of FMA execution units) is the true GPU core: it has a fixed number of shader, texture, raster, tensor, and ray tracing units, etc., all sharing the same register files, caches, and control logic.
Intel gets this: they don't call their small FMA shader units cores at all, they call them execution units, and they call the larger grouping an Xe Core, while NVIDIA calls it a Streaming Multiprocessor (SM) and AMD calls it a Compute Unit (CU). Each SM/CU/Xe core has multiple execution (FMA) units operating under the principle of SIMD (or SIMT, Single Instruction Multiple Threads). Each SM/CU/Xe core also employs different types of computation units: tensor units are just mixed-precision FMA systolic arrays, i.e. General Matrix Multiply-Accumulate (GEMM) accelerators; texture units are samplers with address generators; ray tracing units are MIMD units; etc.
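To make the tensor-unit claim concrete, here is roughly what one matrix-multiply-accumulate "instruction" computes, shown as a 2x2 sketch in plain Python (real tensor cores work on fixed-size fragments, e.g. 4x4, in mixed precision, but the operation is the same):

```python
def mma(A, B, C):
    """D = A @ B + C: the multiply-accumulate a tensor unit performs per op."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) + C[i][j]
             for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 1], [1, 1]]
print(mma(A, B, C))   # [[20, 23], [44, 51]]
```

Every element of the result is itself a chain of FMAs, which is why a tensor unit is fairly described as an array of FMA hardware rather than a new kind of core.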
So a 4090 doesn't really have 16,000 cores; it has 128 SMs/cores. The Arc A770 has 32 Xe cores, the 3060 has 28 SMs/cores, etc. If you look back at the times of old, the GeForce 256 (the first "GPU") didn't have cores, only pixel "shaders"; only after the advent of CUDA, and the merging of vertex and pixel shaders into one unified FMA unit, did NVIDIA start calling these units "cores".
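The arithmetic behind the marketing number is simple (SM count and FP32 lanes per SM here are the published figures for the 4090's Ada Lovelace chip):

```python
# RTX 4090: 128 SMs, each with 128 FP32 FMA lanes.
sms = 128
fma_lanes_per_sm = 128
marketed_cuda_cores = sms * fma_lanes_per_sm
print(marketed_cuda_cores)   # 16384 "CUDA cores" -- really 128 SMs
```

By the strict CPU-style definition argued above, the honest spec-sheet number is the first factor, not the product.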